
AsfALT
Republic Military School
|
Posted - 2008.04.30 08:31:00 -
[1]
Hi. As I don't know how to put this, I will just tell you what I have to do:
I need to compare and sort huge amounts of data from txt files. The data is structured.
I can import the data into Excel, but I don't know how to compare data from different sheets/files (there is so much that I will need many sheets).
If anyone knows a way or a piece of (free) software to do this, I would be grateful. Thanks.
|

AsfALT
Republic Military School
|
Posted - 2008.04.30 13:37:00 -
[3]
Bump... WELP!
|

Tarminic
Black Flame Industries
|
Posted - 2008.04.30 13:45:00 -
[4]
You'll have to be more specific. What is this data, and how exactly do you need it sorted? I could potentially write an application or web script to do it for you, but I'd need details.
|

ry ry
StateCorp Insurgency
|
Posted - 2008.04.30 13:47:00 -
[5]
Edited by: ry ry on 30/04/2008 13:50:33
if you want to suck it all into excel, it should be easy enough to do in VB.
presumably you have a whole bunch of comma/tab delimited text files that you want to combine into one giant spreadsheet and sort by one of the columns?
stick up an example of how the data is structured in each file, and what exactly you want to do with it and i'm sure somebody will be able to help.
personally i'm hoping it involves multidimensional arrays of some kind }:)
|

Tarminic
Black Flame Industries
|
Posted - 2008.04.30 14:20:00 -
[6]
Originally by: ry ry if you want to suck it all into excel, it should be easy enough to do in VB.
But keep in mind that using VB for anything practical causes genital warts and hairy palms.
|

Imperator Jora'h
|
Posted - 2008.04.30 14:26:00 -
[7]
As others have noted you are not clear enough on exactly what you want to do.
If all you want to do is sort on various columns then your task is simplicity itself.
How you move the data into Excel depends on how it is laid out now. If the text is all in one column, then a simple copy/paste into a column in Excel will do. If the text is all one thing after another in essentially a huge paragraph, then you need to do an import (this assumes each item is separated by a comma, making it a comma-delimited file).
Give each column a heading then highlight all the data and have Excel make it into a Table.
Voila...you can sort on any column you like. I have the latest version of Excel and it is particularly good at this with even more bells and whistles to Tables but the older versions of Excel should do fine.
If you are trying to do a comparison on various fields and you have many, many fields your task becomes a good deal more difficult.
|

Ryysa
Sharks With Frickin' Laser Beams Mercenary Coalition
|
Posted - 2008.04.30 15:36:00 -
[8]
note, he said "huge" amount of data.
Excel is very limited by data size...
|

AsfALT
Republic Military School
|
Posted - 2008.04.30 19:40:00 -
[9]
Hi, thank you for your replies.
I do indeed have huge amounts of data (in the hundreds of thousands of entries, maybe over 1 million).
This data is in a few txt files. I have to combine them into an Excel file. The trick is that it has to be filtered: I need only one occurrence of each entry (all entries have to be unique).
If anyone can help it would be great. I saw an app that would compare Excel files, but it was not free...
|

Denton Frost
Amarr KSI
|
Posted - 2008.04.30 19:46:00 -
[10]
GTFO
|

AsfALT
Republic Military School
|
Posted - 2008.04.30 19:48:00 -
[11]
Originally by: Denton Frost
GTFO
What is your problem?
|

Joseph 9
Digital Fury Corporation Digital Renegades
|
Posted - 2008.04.30 20:55:00 -
[12]
Edited by: Joseph 9 on 30/04/2008 20:56:59
Originally by: Denton Frost
GTFO
Ermmm, why should he do that?
Could I suggest you provide us with some small sample files? If the data is sensitive in some way maybe fake something similar we can look at. There are a fair few internal functions in Excel that can be used to manipulate data that may be of use to you but it's hard to say.
Edit
Paying attention ftw
Excel is no use for anything beyond 65,536 rows per sheet (in versions before Excel 2007) iirc. 32,000 in some cases. Frankly, you're going to be better off writing a small programme to do it. Provided the data is nicely structured, pretty much any language will do the job, even VB...
|

AsfALT
Republic Military School
|
Posted - 2008.04.30 21:26:00 -
[13]
Edited by: AsfALT on 30/04/2008 21:26:48
Originally by: Joseph 9 Edited by: Joseph 9 on 30/04/2008 20:56:59
Originally by: Denton Frost
GTFO
Ermmm, why should he do that?
Could I suggest you provide us with some small sample files? If the data is sensitive in some way maybe fake something similar we can look at. There are a fair few internal functions in Excel that can be used to manipulate data that may be of use to you but it's hard to say.
Edit
Paying attention ftw
Excel is no use for anything beyond 65,536 rows per sheet (in versions before Excel 2007) iirc. 32,000 in some cases. Frankly, you're going to be better off writing a small programme to do it. Provided the data is nicely structured, pretty much any language will do the job, even VB...
Sadly, I must deliver the processed data in Excel format...
Each file is structured like this (1 column):
file1: entry1 entry2 entry3
file2: entry4 entry1 entry2
My task is to remove duplicates and deliver the data in .xls format. So I have to be able to compare all the entries in these txt files (each contains about 100k entries) and then output them to several txt files of no more than 60k entries each (for safe import into Excel).
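For anyone reading along, here is a minimal sketch of the "keep only one occurrence of each entry" step, assuming a Unix-style shell with awk is available. The directory name demo_dedupe and the sample entries are made up for illustration; they just mirror the file1/file2 layout above. Unlike a sort-based approach, this keeps the entries in the order they first appear.

```shell
#!/bin/sh
# Sketch only: demo_dedupe and the sample entries are hypothetical.
set -eu

demo=demo_dedupe
rm -rf "$demo"
mkdir -p "$demo"

# Two tiny one-column files mirroring the structure described above.
printf 'entry1\nentry2\nentry3\n' > "$demo/file1.txt"
printf 'entry4\nentry1\nentry2\n' > "$demo/file2.txt"

# awk prints a line only the first time it sees it, so duplicates
# across both files are dropped and first-seen order is preserved.
awk '!seen[$0]++' "$demo/file1.txt" "$demo/file2.txt" > "$demo/unique.txt"

cat "$demo/unique.txt"
```

awk keeps every distinct entry in memory here, which should be fine for a million short lines but is worth knowing about for much larger inputs.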
|

Isiskhan
Gnostic Misanthropy
|
Posted - 2008.04.30 21:53:00 -
[14]
Assuming your files are named after the pattern "file[number]", copy them into a Linux or MacOS box, open up a terminal and execute this:
cd [directory where you copied your files]
mkdir output
cat file* | sort | uniq | split -l 5000 - output/fileout
Substitute the number '5000' by the number of lines you want in each of your output files.
What you'll get in the subdirectory "output" is a set of files named "fileoutaa", "fileoutab", "fileoutac", etc... containing the ordered list of your entries with all duplicates removed and broken into a maximum of 5000 entries per file.
You can then open these in Excel as if they were CSV files.
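A small self-contained sketch of the same pipeline on made-up data, in case it helps to see what comes out. The directory demo_split, the generated entries and the chunk size of 10 are all assumptions chosen to keep the demo tiny; for the real files the chunk size would be something like 60000. It uses sort -u, which does the same job as sort | uniq.

```shell
#!/bin/sh
# Sketch only: demo_split and the generated entries are hypothetical.
set -eu

demo=demo_split
rm -rf "$demo"
mkdir -p "$demo/output"

# file1 holds entry1..entry20, file2 holds entry11..entry25,
# so 10 entries appear in both files.
awk 'BEGIN { for (i = 1;  i <= 20; i++) print "entry" i }' > "$demo/file1"
awk 'BEGIN { for (i = 11; i <= 25; i++) print "entry" i }' > "$demo/file2"

# Merge, sort and drop duplicates, then cut into chunks of at most 10 lines.
cat "$demo"/file* | sort -u | split -l 10 - "$demo/output/fileout"

# Show how many lines ended up in each chunk.
wc -l "$demo"/output/fileout*
```

Note that sort orders the lines as text, so entry10 comes before entry2. That does not matter for removing duplicates, only for how the result looks.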
|

AsfALT
Republic Military School
|
Posted - 2008.04.30 22:38:00 -
[15]
Originally by: Isiskhan Edited by: Isiskhan on 30/04/2008 22:04:58 Assuming your files are named after the pattern "file[number]", copy them into a Linux or MacOS box, open up a terminal and execute this:
cd [directory where you copied your files]
mkdir output
cat file* | sort | uniq | split -l 5000 - output/fileout
Substitute the number '5000' by the number of lines you want in each of your output files.
What you'll get in the subdirectory "output" is a set of files named "fileoutaa", "fileoutab", "fileoutac", etc... containing the ordered list of your entries with all duplicates removed and broken into a maximum of 5000 entries per file.
You can then open these in Excel as if they were CSV files. If you want you can also add to them a ".csv" extension like this (to skip the step of having to tell Excel to show you all files):
cd output
for f in fileout*; do mv "$f" "$f.csv"; done
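A slightly more defensive sketch of that rename loop, assuming a POSIX-style shell. The directory demo_rename and the three sample files are made up for illustration. It quotes the file names and skips anything that already ends in .csv, so running it twice does not produce names like fileoutaa.csv.csv.

```shell
#!/bin/sh
# Sketch only: demo_rename and its sample files are hypothetical.
set -eu

demo=demo_rename
rm -rf "$demo"
mkdir -p "$demo"

# Pretend these are chunks left behind by split; one was already renamed.
printf 'entry1\n' > "$demo/fileoutaa"
printf 'entry2\n' > "$demo/fileoutab"
printf 'entry3\n' > "$demo/fileoutac.csv"

for f in "$demo"/fileout*; do
    case "$f" in
        *.csv) continue ;;   # already has the extension, leave it alone
    esac
    mv -- "$f" "$f.csv"      # quoted so odd file names survive intact
done

ls "$demo"
```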
Is there such an app for windows?
|

Isiskhan
Gnostic Misanthropy
|
Posted - 2008.04.30 23:07:00 -
[16]
All those are system tools Linux / MacOS / Unix come with by default. I believe there are implementations of these for Windows (though the command line syntax will be slightly different).
But I can't help you with that because any time I have to deal with Windows beyond launching games, my blood pressure rises and I end up getting the urge to ritualistically disembowel Steve Ballmer, so my doctor has strongly advised I stay away from it.
I'm sure someone else (or some googling) can help you with installing and running these tools on a Windows environment.
|

AsfALT
Republic Military School
|
Posted - 2008.04.30 23:41:00 -
[17]
Edited by: AsfALT on 30/04/2008 23:45:33
Originally by: Isiskhan Edited by: Isiskhan on 30/04/2008 22:04:58 Assuming your files are named after the pattern "file[number]", copy them into a Linux or MacOS box, open up a terminal and execute this:
cd [directory where you copied your files]
mkdir output
cat file* | sort | uniq | split -l 5000 - output/fileout
Substitute the number '5000' by the number of lines you want in each of your output files.
What you'll get in the subdirectory "output" is a set of files named "fileoutaa", "fileoutab", "fileoutac", etc... containing the ordered list of your entries with all duplicates removed and broken into a maximum of 5000 entries per file.
You can then open these in Excel as if they were CSV files. If you want you can also add to them a ".csv" extension like this (to skip the step of having to tell Excel to show you all files):
cd output
for f in fileout*; do mv "$f" "$f.csv"; done
VICTORY!
Thank you!
I found this app, Cygwin, and I used your command.
EDIT: I can go to bed now as it's 2:44 AM here... :D
Thnx again
|