Fastest way to list the files in a folder

Hi, I currently need to list all the files in a folder with over 20k files (the folder will gain more files over time) and then create a CSV file with all the file names in that folder. But I'm kind of stuck, since it needs to be as fast as possible and all the ways I can think of are basically sequential.
dan1st
dan1st5mo ago
If you want a way that's incremental, you could try something like
PrintWriter csvWriter = ...;
try (Stream<Path> fileStream = Files.list(Path.of("path/to/your/folder"))) {
    fileStream.forEach(path ->
        // your operation
        csvWriter.println(path.getFileName()) // possibly also other CSV data
    );
}
that's probably not the fastest though
keplerk
keplerkOP5mo ago
Yeah, that's one of the ways I found, but in the end it's still iterating one by one.
dan1st
dan1st5mo ago
Oh, I thought you meant you wanted to start working before all entries are retrieved. Well, the OS only tells you the directory entries one after another.
keplerk
keplerkOP5mo ago
I basically need to grab all the file names in that folder and put them in a CSV file.
dan1st
dan1st5mo ago
you need to iterate over it in some way
keplerk
keplerkOP5mo ago
I thought about maybe using threads to "iterate" over them in batches or something, but I just can't come up with an idea of how to do it.
dan1st
dan1st5mo ago
that would just make it slower because you would need to synchronize all accesses to a point where you could only do one thing at once
keplerk
keplerkOP5mo ago
:NOOO:
dan1st
dan1st5mo ago
theoretically, there might be ways to speed up the writing using memory mapping
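A minimal sketch of what writing through a memory-mapped file could look like, assuming the names are already known so the exact byte length can be computed up front. The class name `MappedCsvWriter` and the `Filename` header are illustrative, and whether this beats a buffered writer for a file this small is something only a measurement can tell.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

class MappedCsvWriter {

    /** Writes a header plus one name per line through a memory-mapped region of csv. */
    static void write(List<String> names, Path csv) throws IOException {
        // Encode first: the mapping needs the exact byte length up front.
        StringBuilder text = new StringBuilder("Filename\n");
        for (String name : names) {
            text.append(name).append('\n');
        }
        byte[] bytes = text.toString().getBytes(StandardCharsets.UTF_8);

        try (FileChannel channel = FileChannel.open(csv,
                StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Mapping past the current end of the file grows it to the requested size.
            MappedByteBuffer region =
                    channel.map(FileChannel.MapMode.READ_WRITE, 0, bytes.length);
            region.put(bytes);
            region.force(); // flush the mapped pages to the file
        }
    }
}
```

Note that this variant has to hold all names in memory before writing, which is the opposite trade-off from processing entries one at a time.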
keplerk
keplerkOP5mo ago
Basically relying on java.nio
dan1st
dan1st5mo ago
Well, first you need to figure out what the actual bottleneck is. Is it reading the list of files or writing the CSV that's slow? Hint: use a profiler.
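Short of a full profiler, a rough first check is to time the two phases separately. The sketch below assumes the folder and the CSV path are passed as command-line arguments; `PhaseTimer` and its method names are made up for illustration. The names are collected into a list here only so the phases can be measured apart.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PhaseTimer {

    /** Phase 1: read the names of all entries in dir. */
    static List<String> listNames(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.map(p -> p.getFileName().toString())
                          .collect(Collectors.toList());
        }
    }

    /** Phase 2: write a one-column CSV with a header row. */
    static void writeCsv(List<String> names, Path csv) throws IOException {
        List<String> lines = new ArrayList<>(names.size() + 1);
        lines.add("Filename");
        lines.addAll(names);
        Files.write(csv, lines); // UTF-8, one line per element
    }

    public static void main(String[] args) throws IOException {
        Path dir = Path.of(args[0]); // folder to list
        Path csv = Path.of(args[1]); // CSV file to create

        long t0 = System.nanoTime();
        List<String> names = listNames(dir);
        long t1 = System.nanoTime();
        writeCsv(names, csv);
        long t2 = System.nanoTime();

        System.out.printf("listing: %d ms, writing: %d ms, entries: %d%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000, names.size());
    }
}
```

Wall-clock numbers like these are noisy (OS caches, JIT warm-up), so they only indicate which phase deserves a closer look with a real profiler.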
keplerk
keplerkOP5mo ago
The problem is to get all the files into an array
dan1st
dan1st5mo ago
Why in an array?
keplerk
keplerkOP5mo ago
"list them"
dan1st
dan1st5mo ago
you don't need to put them in an array
keplerk
keplerkOP5mo ago
That's the part i'm stuck in
dan1st
dan1st5mo ago
Just process them one after another once they are ready. That approach doesn't need an array.
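As a sketch of that idea, the loop below writes each name as soon as the directory stream yields it, so no array or list is ever built. The class name `StreamingCsv`, the buffered writer, and the `Filename` header are assumptions for illustration, not something prescribed in the thread.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

class StreamingCsv {

    /** Writes each entry name to csv as it is read; returns the number of data rows. */
    static int writeNames(Path dir, Path csv) throws IOException {
        int rows = 0;
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir);
             BufferedWriter writer = Files.newBufferedWriter(csv)) {
            writer.write("Filename");
            writer.newLine();
            for (Path entry : entries) {
                writer.write(entry.getFileName().toString());
                writer.newLine();
                rows++;
            }
        }
        return rows;
    }
}
```

If the CSV is created inside the folder being listed, it may show up as one of the entries, so writing it elsewhere is the simpler choice. Names containing commas or quotes would also need CSV quoting, which this sketch leaves out.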
keplerk
keplerkOP5mo ago
Yeah, but processing 20k+ files will still take more time than I would want. (To be honest, I'm testing this approach right now at the suggestion of the Google AI.)
dan1st
dan1st5mo ago
Then you need to use a profiler to figure out what exactly the bottleneck is. What exactly are you doing with the names, and how?
keplerk
keplerkOP5mo ago
Basically this:
1. Get all the file names.
2. Create a CSV file with all the names.
That's all: getting the stream of paths and, in the forEach, writing each file name to the PrintWriter.
dan1st
dan1st5mo ago
So the CSV contains only one row?
keplerk
keplerkOP5mo ago
1 column
dan1st
dan1st5mo ago
yeah column
keplerk
keplerkOP5mo ago
Yes
dan1st
dan1st5mo ago
then show me the code you are using and the profiling results
keplerk
keplerkOP5mo ago
Path _dir = Paths.get(SHARE_SIMULAT0R_DRIVE_PATH);
try (PrintWriter writer = new PrintWriter(new FileWriter(newFilePath))) {
    writer.println("Filename");
    try (Stream<Path> stream = Files.list(_dir)) {
        stream.filter(Files::isRegularFile)
              .forEach(file -> {
                  // Compare the name as a String: Path.startsWith/endsWith match whole path elements, not text.
                  String name = file.getFileName().toString();
                  // Condition aligned with the log message: skip names starting with "-" or ending with ".xml".
                  if (name.startsWith("-") || name.endsWith(".xml")) System.out.println("File " + name + " starts with hyphen or ends with .xml, ignoring.");
                  else writer.println(name);
              });
    } catch (IOException ex) {
        System.out.println(ex.getMessage());
    }
} catch (IOException ex) {
    System.out.println(ex.getMessage());
}

This is how I'm doing it. No profiling results yet, since I just coded it and haven't tested it.
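One variation that may be worth measuring: `Files.find` hands the file attributes to its predicate while walking, which on some platforms can avoid a separate `Files.isRegularFile` lookup per entry. Whether it helps on a given drive is an assumption to verify. The sketch below uses made-up names (`FilteredListing`, `isIgnored`) and takes the skip rule from the log message in the snippet above.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

class FilteredListing {

    /** Hypothetical skip rule: names starting with "-" or ending with ".xml". */
    static boolean isIgnored(String name) {
        return name.startsWith("-") || name.endsWith(".xml");
    }

    /** Writes the names of regular files directly inside dir to a one-column CSV. */
    static void writeRegularFiles(Path dir, Path csv) throws IOException {
        try (PrintWriter writer = new PrintWriter(Files.newBufferedWriter(csv));
             // maxDepth 1: only dir itself and its direct children are visited.
             Stream<Path> files = Files.find(dir, 1,
                     (path, attrs) -> attrs.isRegularFile())) {
            writer.println("Filename");
            files.map(path -> path.getFileName().toString())
                 .filter(name -> !isIgnored(name))
                 .forEach(writer::println);
        }
    }
}
```

Subdirectories and the start folder itself are dropped by the `isRegularFile` check, so only plain files reach the CSV.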
dan1st
dan1st5mo ago
Use buffering for the writer:
try (PrintWriter writer = new PrintWriter(Files.newBufferedWriter(newFilePath))) {
roy
roy5mo ago
.
keplerk
keplerkOP5mo ago
In case anyone is interested, I ended up creating a C++ exe that uses the Windows API to read the folder, list the files, and then create a CSV file, and I call it from Java with ProcessBuilder. The result was listing and creating a CSV file with 23k lines in about 6 to 8 seconds.
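The Java side of such a setup might look roughly like the sketch below. The executable's name and its two arguments (folder, then CSV path) are hypothetical, since the actual helper and its command line were not posted.

```java
import java.io.IOException;
import java.nio.file.Path;

class NativeLister {

    /** Builds the command line; the helper and its argument order are hypothetical. */
    static ProcessBuilder buildCommand(Path exe, Path dir, Path csv) {
        return new ProcessBuilder(exe.toString(), dir.toString(), csv.toString())
                .inheritIO(); // forward the helper's stdout/stderr to this process
    }

    /** Starts the helper, waits for it to finish, and returns its exit code. */
    static int run(Path exe, Path dir, Path csv) throws IOException, InterruptedException {
        Process process = buildCommand(exe, dir, csv).start();
        return process.waitFor();
    }
}
```

A call such as `NativeLister.run(Path.of("list_files.exe"), dir, csv)` would block until the helper exits. By the usual convention a non-zero exit code signals failure, though that depends on how the helper is written.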