Hi there,
Something like this might do the trick...
Code:
find / -xdev -exec file -b {} \; -printf '%f\n%h\n%b\0' > index
It would produce a file called "index" containing null-separated records, with the fields you're looking for delimited by line breaks. There are lots of ways of doing this sort of thing, but this one seemed reasonably robust & efficient. Creating some sort of intermediate temporary file means you can re-run the SQL conversion without having to re-scan your entire filesystem, in the event you encounter problems.
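In case it helps, here's a rough sketch of how you might read those records back in bash before the SQL step. The sample record is made up for the demo — a real index would come from the find command above — and it relies on a couple of bash-specific features (read -d '' and mapfile):

```shell
# A made-up sample record, mimicking what the find/file pipeline emits:
# type, file name, directory, and block count, one per line, with a NUL
# terminating the record.
printf 'ASCII text\na.txt\n/tmp/demo\n8\0' > index.sample

# read -d '' pulls in one NUL-terminated record at a time;
# mapfile then splits it into fields on the embedded newlines.
while IFS= read -r -d '' rec; do
    mapfile -t f <<< "$rec"
    printf 'name=%s dir=%s blocks=%s type=%s\n' \
        "${f[1]}" "${f[2]}" "${f[3]}" "${f[0]}"
done < index.sample
```

From there, each field is a clean shell variable you can feed into your SQL generation.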
Some other suggestions...
- It's worth taking a moment to consider what user you want to run your index-building command as. A reasonably privileged user would be a good idea ... but perhaps not so privileged as to allow users of your index to access file meta-information they shouldn't be allowed to see.
- If you're going to automate this procedure, nicing it pretty heavily might be a good idea. Something along the lines of nice -n 19 find ... would help reduce the performance impact of running such an I/O-intensive command in the background, while people are using your box.
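For instance (run over a little demo directory here so it's quick to try — substitute / and a sensible output path for the real thing; on Linux, sticking ionice -c3 in front as well would move the disk access into the idle I/O class, if you have it):

```shell
# Demo over a small directory; swap in / for the real run.
# nice -n 19 gives the scan the lowest CPU scheduling priority.
mkdir -p demo_idx && echo 'hello' > demo_idx/x.txt
nice -n 19 find demo_idx -xdev -exec file -b {} \; \
    -printf '%f\n%h\n%b\0' > idx.out
```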
- I wasn't quite sure what you meant by "size" and "type" in your original post. In the command I suggested, I opted for size = the amount of disk space allocated to each file (find's %b, counted in 512-byte blocks) and type = a wordy description of each file's contents (file's default output). You might, for example, prefer the actual file length in bytes & its MIME type instead.
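If you do want the latter, the variation is small — %s is the exact length in bytes, and --mime-type is supported by the usual versions of file. (Demo directory again; swap in / for real use.)

```shell
mkdir -p demo_mime && echo 'hello' > demo_mime/y.txt
# %s = exact file length in bytes; file -b --mime-type gives
# e.g. "text/plain" instead of a wordy description.
find demo_mime -xdev -type f -exec file -b --mime-type {} \; \
    -printf '%f\n%h\n%s\0' > idx.mime
```

Note I added -type f there too — MIME types only really make sense for regular files.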
- When you're converting to SQL, be sure to take precautions over special characters (eg ' [apostrophe]). If, for example, you're using MySQL, you might want to pass the entire temporary index file through sed, substituting \' [backslash/apostrophe] for ' [apostrophe] — and remember to escape any backslashes in the data as well, or your INSERT statements can still break.
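For example, under MySQL's string-literal rules I'd do the backslashes first, then the apostrophes, so freshly-added backslashes don't get doubled up again (index.sample here is just a stand-in file for the demo):

```shell
# A stand-in line containing both troublesome characters.
printf '%s\n' "O'Brien\\path" > index.sample

# Escape backslash first, then apostrophe, per MySQL quoting rules.
sed -e 's/\\/\\\\/g' -e "s/'/\\\\'/g" index.sample > index.escaped
cat index.escaped    # O\'Brien\\path
```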
Anyhow, I hope that gives you some ideas.