#58037 - 06/10/21 12:26 PM How to split a large data file by States
Sami786 Offline
OL Expert

Registered: 01/29/14
Posts: 393
Loc: Home
Hi team, I'm currently using the Emulated Data Splitter, page by page. It works, but it takes a long time to split the 150K records into 7 States!

Is there a script I can use to split the data file by State, or any other method that would be faster? Please let me know.

THANKS
_________________________
Peace

#58045 - 06/14/21 01:52 PM Re: How to split a large data file by States [Re: Sami786]
Sami786 Offline
OL Expert

Registered: 01/29/14
Posts: 393
Loc: Home
Can someone help me update the code below to split the main data file by multiple words?

Example: search line by line. If the word "DEF" is found, copy the entire line into the DEF data file, and keep going until all matching lines are found. Do the same for 6 more words, copying each matching line into its own split file.

' When this script completes, I expect to have 7 individual split files!

' Variable declarations.
Option Explicit
Const cRead = 1, cWrite = 2, cAppend = 8
Dim FSO, fileInput, fileABC, fileDEF, stringToSearch
Set FSO = CreateObject("Scripting.FileSystemObject")

' Open the current input file.
Set fileInput = FSO.OpenTextFile(Watch.GetJobFileName, cRead)

' Create one output file per search word.
' The folder below is only a placeholder; adjust it as needed.
Set fileABC = FSO.CreateTextFile("C:\Tests\Output\ABC.txt", True)
Set fileDEF = FSO.CreateTextFile("C:\Tests\Output\DEF.txt", True)

' Loop through the lines and copy each matching line to its output file.
Do While Not fileInput.AtEndOfStream
  stringToSearch = fileInput.ReadLine()

  ' Look for the search word within the first 10 characters of the line.
  If InStr(Mid(stringToSearch, 1, 10), "ABC") > 0 Then
    fileABC.WriteLine stringToSearch
  End If

  If InStr(Mid(stringToSearch, 1, 10), "DEF") > 0 Then
    fileDEF.WriteLine stringToSearch
  End If
Loop

' Close all files, then delete the original job file.
fileInput.Close
fileABC.Close
fileDEF.Close
FSO.DeleteFile Watch.GetJobFileName, True



Edited by Sami786 (06/14/21 01:58 PM)
_________________________
Peace

#58049 - 06/15/21 10:55 AM Re: How to split a large data file by States [Re: Sami786]
Philippe F. Offline
OL Expert

Registered: 09/06/00
Posts: 1968
Loc: Objectif Lune, Montreal, Qc
Something like this should work. It's JavaScript, though; I haven't coded in VBScript for years.

Code:
var FSO = new ActiveXObject("Scripting.FileSystemObject");

var inputFile = FSO.OpenTextFile(Watch.GetJobFileName(),1);
var outputFiles = {};
var currentType = "";
var lineType = "";
var line="";

while (!inputFile.AtEndOfStream) {
  line     = inputFile.ReadLine();
  lineType = line.slice(0,3);
  // Check if property exists, otherwise create it and assign
  // a newly created file handle to it
  if(!outputFiles[lineType]){
    outputFiles[lineType] = FSO.CreateTextFile("C:\\Tests\\Output\\" + lineType + ".txt");
  }
  outputFiles[lineType].WriteLine(line);
}

//Close all files
for(var i in outputFiles) outputFiles[i].Close();


The code creates an object called outputFiles, and inside that object a property is dynamically created for each type of file that needs to be created (i.e. DEF, ABC, GHI, etc.). All files remain open until the input file has been processed; they are all closed at the very end.
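
The same dynamic-property pattern can be tried outside Workflow. Below is a minimal sketch in plain JavaScript in which arrays stand in for the file handles. The function name groupByPrefix, its parameters and the sample lines are made up for illustration and are not part of the script above.

```javascript
// Hypothetical helper: group lines by their first `width` characters.
// Arrays stand in for the output files of the Workflow script.
function groupByPrefix(lines, width) {
  var groups = {};
  for (var i = 0; i < lines.length; i++) {
    var line = lines[i];
    var key = line.slice(0, width);
    // Create the property the first time this key is seen
    if (!groups.hasOwnProperty(key)) {
      groups[key] = [];
    }
    groups[key].push(line);
  }
  return groups;
}

// Usage with made-up sample lines
var sample = ["ABC first", "DEF second", "ABC third"];
var grouped = groupByPrefix(sample, 3);
// grouped.ABC holds 2 lines, grouped.DEF holds 1 line
```

Each key is created only once, so the number of groups (or output files) follows the data rather than a fixed list of if statements.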

On my system, I processed 300,000 lines in about 4 seconds, generating 3 different output files.
_________________________
Technical Product Manager
I don't want to achieve immortality through my work; I want to achieve immortality through not dying - Woody Allen

#58054 - 06/15/21 01:12 PM Re: How to split a large data file by States [Re: Sami786]
Sami786 Offline
OL Expert

Registered: 01/29/14
Posts: 393
Loc: Home
Thank you Philippe
_________________________
Peace

#58055 - 06/15/21 02:02 PM Re: How to split a large data file by States [Re: Sami786]
Sami786 Offline
OL Expert

Registered: 01/29/14
Posts: 393
Loc: Home
I get the error below:
Line 1, column 28, Expected end of statement

I don't know Java. Can you please let me know why I get this error?

Also, I assume the line below is where I can set the data location to look for ABC or ....

lineType = line.slice(0,3);

In my case it's in line 4, so I will use (4,3), correct?
_________________________
Peace

#58064 - 06/17/21 01:04 PM Re: How to split a large data file by States [Re: Sami786]
Philippe F. Offline
OL Expert

Registered: 09/06/00
Posts: 1968
Loc: Objectif Lune, Montreal, Qc
  1. It's JavaScript, not Java
  2. The error message is probably due to the fact that you didn't tell Workflow that this script is JavaScript. Open the task and select the proper option from the Language menu
  3. The script reads each individual line, so you don't have to specify on which line to look for the data. If you meant the "column" however, then yes, the slice(0,3) statement is where you specify it.
    Since JavaScript indexes are 0-based, the first character is at index 0. In your case, the 4th character is at index 3. The second argument of slice() is the end index (not included), not a length, so to get 3 characters starting at the 4th one the statement would be
    slice(3,6)
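
As a quick check of how slice() treats its arguments: it takes a start index and an end index (not included), rather than a start and a length, so three characters starting at the 4th character are obtained with slice(3,6). A minimal sketch with a made-up line:

```javascript
// Made-up sample line; columns are counted from index 0.
var line = "XX-DEF-0001";

var fromStart  = line.slice(0, 3);  // "XX-": characters at index 0, 1, 2
var fromFourth = line.slice(3, 6);  // "DEF": characters at index 3, 4, 5
var empty      = line.slice(3, 3);  // "":    start equals end, nothing is returned
var byLength   = line.substr(3, 3); // "DEF": substr() takes (start, length) instead
```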
_________________________
Technical Product Manager
I don't want to achieve immortality through my work; I want to achieve immortality through not dying - Woody Allen

#58067 - 06/21/21 02:29 PM Re: How to split a large data file by States [Re: Sami786]
Sami786 Offline
OL Expert

Registered: 01/29/14
Posts: 393
Loc: Home
Hi Philippe,
thanks for your guidance. I now have the script working, but the indexing is not pulling the correct name from the data locations.

I have two options for splitting the data that I need your help with:

Option 1: the address block is read from data locations 2-5

Addr1 = Mr. Sample Sample
Addr2 = Company Name
Addr3 = 99 Sample Street
Addr4 = Toronto ON X9X 9X9 = City State Postal_Code (all in one line separated by space)

As you said, the script reads each individual line to look for the data (State names).

My confusion is how the indexing works to find each State, since the city, State and postal code are all on one line separated by spaces, and we could have 3-line or 4-line address blocks.
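
For Option 1, one possible approach is to read the State code from the end of the line instead of from a fixed column, since the postal code always comes last. This is only a minimal sketch, assuming a two-letter State code followed by a postal code in the "X9X 9X9" layout shown above. The function name provinceFromLine is hypothetical.

```javascript
// Hypothetical helper: return the two-letter code that sits just before
// a trailing postal code, or "" when the line is not a city line.
function provinceFromLine(line) {
  var match = /\s([A-Z]{2})\s+[A-Z]\d[A-Z]\s?\d[A-Z]\d\s*$/.exec(line);
  return match ? match[1] : "";
}

var fromCityLine   = provinceFromLine("Toronto ON X9X 9X9"); // "ON"
var fromStreetLine = provinceFromLine("99 Sample Street");   // ""
```

Because only the city line matches, the same test can be applied to every line of a 3-line or 4-line address block without knowing in advance which line holds the State.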

---------------------

Option 2:
The State name is also spelled out in line 20, like below:
Newfoundland
New Brunswick
Ontario
Quebec
Saskatchewan
....
...
...

I think Option 2 might be the best way to split by State name, but how can I use indexing to split by line 20? I think I might need an if statement for each State name?

I used the indexing below, but I get wrong results.

lineType = line.slice(20,100);

I need to tell the indexing to grab the value in line 20 and split by State name?

Thanks
_________________________
Peace

#58071 - 06/22/21 01:09 PM Re: How to split a large data file by States [Re: Sami786]
Philippe F. Offline
OL Expert

Registered: 09/06/00
Posts: 1968
Loc: Objectif Lune, Montreal, Qc
Again, the slice() method looks at COLUMNS not at lines.
You don't have to worry about line numbers since the script reads them one at a time!

But now your requirements are completely different. You initially said the string to search for was found on each line. Now it is found only on certain lines. It would have been easier if you had mentioned you were looking for state/provinces inside your file.

This will require a different type of script. It won't be much more complex than the one above, but without some sample data, I'm not going to spend time writing something that may not work and that will require several back and forths through this forum.

I would suggest you either post some sample data (and describe it) or you open up a call with our professional services staff who will certainly be able to help you out in less than an hour.
_________________________
Technical Product Manager
I don't want to achieve immortality through my work; I want to achieve immortality through not dying - Woody Allen

#58072 - 06/22/21 04:40 PM Re: How to split a large data file by States [Re: Sami786]
Sami786 Offline
OL Expert

Registered: 01/29/14
Posts: 393
Loc: Home
Hi Philippe, sorry for the confusion. Below is my test data file, which looks exactly like the client data. If this is too much, please let me know and I will open up a call.

Sample data: comma-delimited data file
Header:
clientID,fname,lname,company,addr1,city State Postal Code,Language,State Name,barcode

A-10009,Simon,White,ABC limited,99 Sample St,Toronto ON X9X 9X9,E,Toronto,00004001
B-10010,David,Strong,Digital Sign,10 Station St,Collingwood NB X9X 9X9,E,New Brunswick,00000088
G-10011,Alix,McFee,XDT printing,190 Short St,Oakville NL X9X 9X9,E,Newfoundland,00000009

The above 3 records are samples. City, State and postal code are provided in one field, but the State name is also populated in field 8, so I think it would be best to split by "State Name".

To split this data file, we should get 3 split files as below:
Toronto.csv
New Brunswick.csv
Newfoundland.csv
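
The split described here could be sketched roughly as follows. This is a minimal sketch, not a tested Workflow script: it assumes plain comma-separated values with no quoted commas, that the header line should be repeated in every split, and that blank lines can be skipped. The function name splitByField is hypothetical, and arrays stand in for the output files.

```javascript
// Hypothetical helper: group CSV records by one field (0-based index).
function splitByField(lines, fieldIndex) {
  var header = lines[0];
  var groups = {};
  for (var i = 1; i < lines.length; i++) {
    if (lines[i] === "") continue;             // skip blank lines
    var key = lines[i].split(",")[fieldIndex];
    if (!groups.hasOwnProperty(key)) {
      groups[key] = [header];                  // assumption: repeat the header
    }
    groups[key].push(lines[i]);
  }
  return groups;
}

// The sample records from this post; "State Name" is field 8, so index 7.
var data = [
  "clientID,fname,lname,company,addr1,city State Postal Code,Language,State Name,barcode",
  "",
  "A-10009,Simon,White,ABC limited,99 Sample St,Toronto ON X9X 9X9,E,Toronto,00004001",
  "B-10010,David,Strong,Digital Sign,10 Station St,Collingwood NB X9X 9X9,E,New Brunswick,00000088",
  "G-10011,Alix,McFee,XDT printing,190 Short St,Oakville NL X9X 9X9,E,Newfoundland,00000009"
];
var splits = splitByField(data, 7);
// splits has three properties: "Toronto", "New Brunswick" and "Newfoundland"
```

In Workflow, each property of splits would then be written to its own file (for example the property name plus ".csv") with the FileSystemObject, in the same way the earlier script writes one file per line type.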

THANK YOU


Edited by Sami786 (06/23/21 09:27 AM)
_________________________
Peace

#58081 - 06/28/21 03:40 PM Re: How to split a large data file by States [Re: Sami786]
Sami786 Offline
OL Expert

Registered: 01/29/14
Posts: 393
Loc: Home
Hi, sample data is described above, let me know when you get a chance. thanks
_________________________
Peace
