#48881 - 07/31/14 07:01 PM
Image extraction from a file?
|
OL Newbie
Registered: 09/22/11
Posts: 16
|
I am working with a company that does statements for a bank. The statement data I get is in XML format, and it comes with a file of images. The images are in TIFF format, with 44 characters of header information preceding each image in this file. In an ideal world, PlanetPress would look at the XML of the statement data, which includes the header information, extract the matching image from the file, and place it on the page. Since these are check images and I want several on a page, I need the extracted images to print, step right, print, step left and down, print, and so on until a page is filled, then go to the next page. Anyway, I guess the most important part is: can PP take this image blob and extract an image out of it? If so, where should I be looking to process it? Thanks!
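For reference, the "print, step right, step left and down" placement described above can be sketched as a simple grid calculation. This is only an illustration in Python, not PlanetPress code; the function name, the 2 x 4 grid, and all dimensions (in inches) are made-up assumptions.

```python
from typing import Iterator, Tuple

def check_positions(count: int, cols: int = 2, rows: int = 4,
                    cell_w: float = 4.0, cell_h: float = 2.5,
                    left: float = 0.25, top: float = 0.5
                    ) -> Iterator[Tuple[int, float, float]]:
    """Yield (page, x, y) per image, filling left to right, then top to bottom."""
    per_page = cols * rows                  # hypothetical: 8 checks per page
    for i in range(count):
        page, slot = divmod(i, per_page)    # which page, and which slot on it
        row, col = divmod(slot, cols)       # step right first, then down
        yield page + 1, left + col * cell_w, top + row * cell_h

positions = list(check_positions(10))
```

With these assumed defaults, the ninth image starts a second page at the top-left position.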
|
#48882 - 08/01/14 11:02 AM
Re: Image extraction from a file?
[Re: Chris S.]
|
OL Expert
Registered: 07/23/08
Posts: 152
Loc: Flint, Michigan
|
So do you have tifs or do you actually have something that is no longer a valid tif because somebody stuck 44 characters of header information inside the file?
Do you have a couple dummy records that you could post somewhere to look at?
Repeating the same image multiple times in varying positions is straightforward. It is just a matter of how to get the header and image out of the XML.
|
#48888 - 08/01/14 05:58 PM
Re: Image extraction from a file?
[Re: Chris S.]
|
OL Newbie
Registered: 09/22/11
Posts: 16
|
Not sure how well this will copy/paste, but I will trim it for clarity's sake.
00000000000000102008F 00000259301004001599II*0»»ò`
`¬ %(]Ñfi
{{insert gibberish here}}
C$Y∂4C1óQ$±∏€Ä00000000000000102010F 00000259301004001602II*0»»∞(
(<B%(
The first bit there is the start of the file, so 44 characters from there is the header, then the gibberish of a tif file, then the next header, gibberish, and so on. So as best as I can see, I need to take a bit of data from my main xml file to match the last bit of the numbering, split it from the beginning of the 44 characters, and continue to the next set of 44. The tif data is all good, just mashed together, and if possible I'd like to not have to split these apart. The mailings we will be doing will be about 4000 pieces each, and assuming an average of 4 checks each (just pulling a number out), I could easily end up with 16K images, for the small mailings, and who knows how many for the quarterly mailings (if they have images at all). If needed, I can link the sample files I am working from to provide additional clarity. Thanks for taking a gander!
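If a script outside PlanetPress turns out to be acceptable, the layout described above (44 characters of header, then TIFF data, repeated) could be indexed roughly as follows. This is a sketch under assumptions, not a tested solution for the real file: it assumes every image starts with the little-endian TIFF signature `II*` followed by a zero byte, as the sample suggests, and that every header is printable ASCII. The same four signature bytes could in principle occur inside image data, so the results would need checking. The function name is hypothetical.

```python
from typing import Dict, List

HEADER_LEN = 44            # per the thread; adjust if the real header differs
TIFF_MAGIC = b"II*\x00"    # little-endian TIFF signature (assumed for every image)

def index_images(blob: bytes) -> Dict[str, bytes]:
    """Map each header (as stripped text) to the TIFF bytes that follow it."""
    starts: List[int] = []
    pos = blob.find(TIFF_MAGIC)
    while pos != -1:
        header = blob[pos - HEADER_LEN:pos] if pos >= HEADER_LEN else b""
        # Accept a candidate only if a full, printable-ASCII header precedes it.
        if len(header) == HEADER_LEN and all(32 <= b < 127 for b in header):
            starts.append(pos)
        pos = blob.find(TIFF_MAGIC, pos + 1)

    images: Dict[str, bytes] = {}
    for n, start in enumerate(starts):
        # An image ends where the next header begins, or at end of file.
        end = starts[n + 1] - HEADER_LEN if n + 1 < len(starts) else len(blob)
        key = blob[start - HEADER_LEN:start].decode("ascii").strip()
        images[key] = blob[start:end]
    return images

# Usage (file name taken from the thread):
# with open("currdate.img", "rb") as f:
#     images = index_images(f.read())
```

The file is read in binary mode throughout, so the image bytes are never passed through a text conversion.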
|
#48891 - 08/04/14 10:07 AM
Re: Image extraction from a file?
[Re: Chris S.]
|
OL Guru
Registered: 07/03/12
Posts: 106
|
Can't you just write the contents of your node to a jobinfo, use it in create file, and go from there? Once processed, the TIFFs could be uploaded to the virtual drive or saved to the local hard disk to be used within design.
Usually, though, what you would expect to see is base64, for example. Is that TIFF data held within a CDATA XML tag, or are they simply dumping binary data into a standard XML node?
|
#48892 - 08/04/14 10:35 AM
Re: Image extraction from a file?
[Re: Chris S.]
|
OL Newbie
Registered: 09/22/11
Posts: 16
|
All of the image data is in a separate file (currdate.img), *not* in the XML itself. The only information in the XML is the 'filename' as such that would be used to locate a given image in the image data file.
Also, just to be clear, I do get these images as a separate file, but 1 file with all of them tied together, not many individual files. I am trying to avoid having to write a script to break the images apart, thus my questions.
Edited by Chris S. (08/04/14 10:44 AM)
|
#48893 - 08/04/14 10:43 AM
Re: Image extraction from a file?
[Re: Chris S.]
|
OL Guru
Registered: 07/03/12
Posts: 106
|
Can you run the file through a generic data splitter and use split data file on a word, with 'the word' being your header?
It seems the key here would be to split up your .img file, and PlanetPress certainly has the tools for that. You might end up having to insert a form feed using regular expressions with the search and replace plugin and split on that, but I'm sure it can be done at the end of the day.
Perhaps if you were to post the .img file to support, you could get help with that?
|
#48894 - 08/04/14 12:27 PM
Re: Image extraction from a file?
[Re: Chris S.]
|
OL Newbie
Registered: 09/22/11
Posts: 16
|
Here is a link to my sample XML and image files. https://www.dropbox.com/sh/8qyrmuopbkq8wie/AABmLMr_R8DS6UpkHPyFUn2Ka
I had not thought about using the splitter, and that might be an option. The tricky part, as far as I can see, is that the 44 characters don't have a beginning delimiter, so I'm not sure if that would work or not.
Edited by Chris S. (08/04/14 12:29 PM)
|
#48895 - 08/04/14 12:50 PM
Re: Image extraction from a file?
[Re: ppuserd]
|
OL Expert
Registered: 10/14/05
Posts: 4956
Loc: Objectif Lune Montreal
|
Can't you just write the contents of your node to a jobinfo, use it in create file, and go from there? Once processed, the TIFFs could be uploaded to the virtual drive or saved to the local hard disk to be used within design.
Usually, though, what you would expect to see is base64, for example. Is that TIFF data held within a CDATA XML tag, or are they simply dumping binary data into a standard XML node?
The thing is that you'd have to also remove the header that is present in the data stream. And it's not a 100% safe method: if, during the copying of the image stream, some of the "garbage" characters cannot be properly copied, then it will create an invalid image file. MAYBE it will work... but it's not something that we can guarantee.
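The risk described above can be shown with a generic example. This is plain Python and says nothing about how PlanetPress itself handles the data; the sample bytes are made up. It only illustrates that pushing binary data through a text conversion can change it.

```python
# TIFF-like bytes, invented for illustration only.
raw = b"II*\x00\x08\x00\x00\x00\xbb\xf2\x60\xac"

# Treating the bytes as UTF-8 text replaces invalid sequences,
# so writing the text back out does not reproduce the original.
as_text = raw.decode("utf-8", errors="replace")
round_tripped = as_text.encode("utf-8")
print(round_tripped == raw)   # False

# Handling the data strictly as bytes preserves it.
copy = bytes(raw)
print(copy == raw)            # True
```

Any step that copies the image stream as characters instead of raw bytes is exposed to this kind of damage.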
|
#48900 - 08/05/14 06:40 AM
Re: Image extraction from a file?
[Re: Chris S.]
|
OL Guru
Registered: 07/03/12
Posts: 106
|
I don't think this is going to work, unless you can find a way to edit the file and create a single image from it manually first.
How do you know if they are valid TIFF files? Were you able to succeed in doing the above? I failed miserably.
More generally, I would push to get the images in a recognised format such as .tif or .jpg, for example.
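On the validity question above, a first sanity check on an extracted chunk could look at the 8-byte TIFF header: a byte-order mark (`II` or `MM`), the number 42, and an offset to the first image directory. The sketch below is a rough Python check with a hypothetical function name. Passing it does not prove the image will decode; it only rules out chunks that are obviously not TIFF.

```python
import struct

def looks_like_tiff(data: bytes) -> bool:
    """Check byte order, the magic number 42, and a plausible first-IFD offset."""
    if len(data) < 8:
        return False
    order = data[:2]
    if order == b"II":        # little-endian
        fmt = "<HI"
    elif order == b"MM":      # big-endian
        fmt = ">HI"
    else:
        return False
    magic, ifd_offset = struct.unpack(fmt, data[2:8])
    # The first IFD must lie after the 8-byte header and inside the data.
    return magic == 42 and 8 <= ifd_offset < len(data)
```

A chunk that still has the 44-character header attached fails this check, since it begins with digits instead of a byte-order mark.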
|