Friday, February 11, 2011

Easiest way to get file's contents in C

What is the simplest way (least error-prone, least lines of code, however you want to interpret it) to open a file in C and read its contents into a string (char*, char[], whatever)?

  • I tend to just load the entire file into memory as a raw chunk and do the parsing on my own. That way I have the best control over what the standard lib does on multiple platforms.

    This is a stub I use for this. You may also want to check the error codes returned by fseek, ftell and fread (omitted for clarity).

    char * buffer = 0;
    long length;
    FILE * f = fopen (filename, "rb");
    
    if (f)
    {
      fseek (f, 0, SEEK_END);
      length = ftell (f);
      fseek (f, 0, SEEK_SET);
      buffer = malloc (length);
      if (buffer)
      {
        fread (buffer, 1, length, f);
      }
      fclose (f);
    }
    
    if (buffer)
    {
      // start to process your data / extract strings here...
    }
    
    Chris Bunch : Awesome, that worked like a charm (and is pretty simple to follow along). Thanks!
    freespace : I would also check the return value of fread, since it might not actually read the entire file due to errors and what not.
    rmeador : Along the lines of what freespace said, you might want to check to ensure the file isn't huge. Suppose, for instance, that someone decided to feed a 6GB file into that program...
    Chris Bunch : Definitely, just like Nils said originally, I'm going to go look up the error codes on fseek, ftell, and fread and act accordingly.
    dicroce : Seeking to the end just so you can call ftell? Why not just call stat?
    KPexEA : like rmeador said, fseek will fail on files >4GB.
    Nils Pipenbrinck : True. For large files this solution sucks.
    Nils Pipenbrinck : I haven't suggested using stat simply because it's not ANSI C. (At least I think so). Afaik the "recommended" way to get a file-size is to seek to the end and get the file offset.
    Dan : This is good and easy... but it will choke if you need to read from a pipe rather than an ordinary file, which is something that most UNIX programs will want to do at some point.
    Tim : Don't forget to free the buffer when you are done.
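A minimal sketch of the same fseek/ftell approach with the checks the comments ask for. The function name `read_file` and the `out_len` parameter are hypothetical, not part of the original answer. It assumes a seekable regular file whose size fits in a `long`, and it allocates one extra byte so the result can be used as a C string.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not from the original answer.
 * Reads the whole file at `path` into a malloc'd, NUL-terminated buffer.
 * Returns NULL on any failure; on success stores the byte count in *out_len.
 * The caller must free() the result. */
char *read_file(const char *path, size_t *out_len)
{
    assert(path != NULL && out_len != NULL);

    FILE *f = fopen(path, "rb");
    if (!f)
        return NULL;

    char *buffer = NULL;
    long length;

    /* ftell returns -1L on error, for example when the stream is a pipe. */
    if (fseek(f, 0, SEEK_END) != 0 ||
        (length = ftell(f)) < 0 ||
        fseek(f, 0, SEEK_SET) != 0) {
        fclose(f);
        return NULL;
    }

    buffer = malloc((size_t)length + 1);   /* +1 for the terminating '\0' */
    if (buffer) {
        size_t got = fread(buffer, 1, (size_t)length, f);
        if (got != (size_t)length) {       /* short read: treat as failure */
            free(buffer);
            buffer = NULL;
        } else {
            buffer[got] = '\0';
            *out_len = got;
        }
    }
    fclose(f);
    return buffer;
}
```

Typical use would be `char *s = read_file(filename, &n);` followed by `free(s)` once the data has been processed. Files that contain '\0' bytes are still read completely; in that case rely on the returned length rather than on string functions.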
  • "simplest way" and "least error-prone" are often opposites of each other.

  • If "read its contents into a string" means that the file does not contain characters with code 0, you can also use the getdelim() function, which either accepts a block of memory and reallocates it if necessary, or allocates the entire buffer for you, and reads the file into it until it encounters a specified delimiter or the end of the file. Just pass '\0' as the delimiter to read the entire file.

    This function is available in the GNU C Library (and has been part of POSIX since POSIX.1-2008), http://www.gnu.org/software/libc/manual/html_mono/libc.html#index-getdelim-994

    The sample code might look as simple as

    char* buffer = NULL;
    size_t size = 0;   /* getdelim() needs a valid size_t*, not a null pointer */
    ssize_t bytes_read = getdelim( &buffer, &size, '\0', fp);
    if ( bytes_read != -1) {
      /* Success, now the entire file is in the buffer */
    }
    free(buffer);
    
    ephemient : I've used this before! It works very nicely, assuming the file you're reading is text (does not contain \0).
    From dmityugov
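A self-contained sketch of the getdelim() approach, for readers who want something they can compile. The function name `slurp_text` is hypothetical. It assumes a POSIX.1-2008 or glibc environment and a text file with no '\0' bytes; note that getdelim() also returns -1 for an empty file, so an empty file is reported as a failure here.

```c
#define _POSIX_C_SOURCE 200809L  /* expose getdelim() and ssize_t */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: slurp a text file that contains no '\0' bytes.
 * Returns a malloc'd, NUL-terminated string, or NULL on failure. */
char *slurp_text(const char *path)
{
    assert(path != NULL);

    FILE *fp = fopen(path, "r");
    if (!fp)
        return NULL;

    char *buffer = NULL;  /* NULL with size 0 asks getdelim() to allocate */
    size_t size = 0;
    ssize_t bytes_read = getdelim(&buffer, &size, '\0', fp);
    fclose(fp);

    if (bytes_read == -1) {   /* read error, or nothing to read */
        free(buffer);
        return NULL;
    }
    return buffer;            /* caller must free() */
}
```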
  • Another, unfortunately highly OS-dependent, solution is memory mapping the file. The benefits generally include faster reads and reduced memory use, as the application's view and the operating system's file cache can actually share the physical memory.

    POSIX code would look like:

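        int fd = open("filename", O_RDONLY);
        off_t len = lseek(fd, 0, SEEK_END);   /* off_t, not int: file sizes can exceed INT_MAX */
        void *data = mmap(0, len, PROT_READ, MAP_PRIVATE, fd, 0);
        if (data == MAP_FAILED) { /* handle the error */ }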
    

    Windows, on the other hand, is a little trickier, and unfortunately I don't have a compiler in front of me to test, but the functionality is provided by CreateFileMapping() and MapViewOfFile().

    From Jeff Mc
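A fuller POSIX sketch of the memory-mapping idea with error handling. The function name `map_file` is hypothetical, and fstat() is used here instead of the lseek() call in the answer to get the size. The mapped bytes are not NUL-terminated, so treat the result as a length-delimited buffer rather than a C string.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map a file read-only. Returns NULL on failure.
 * Release the mapping with munmap((void *)ptr, *out_len) when done. */
const char *map_file(const char *path, size_t *out_len)
{
    assert(path != NULL && out_len != NULL);

    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) == -1 || st.st_size == 0) {  /* mmap rejects length 0 */
        close(fd);
        return NULL;
    }

    void *data = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);   /* the mapping stays valid after the descriptor is closed */
    if (data == MAP_FAILED)
        return NULL;

    *out_len = (size_t)st.st_size;
    return data;
}
```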
  • If the file is text, and you want to get the text line by line, the easiest way is to use fgets().

    char buffer[100];
    FILE *fp = fopen("filename", "r");                 // do not use "rb"
    if (fp) {
      while (fgets(buffer, sizeof(buffer), fp)) {
        // ... do something with buffer; lines longer than 99 chars arrive in pieces
      }
      fclose(fp);
    }
    
    From selwyn
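
Reading incrementally, as the fgets() answer does, also covers the pipe case raised in the comments on the first answer, where fseek/ftell cannot report a size. Below is a minimal sketch that grows a buffer with realloc() until end of file. The function name `read_stream` and the 4096-byte starting capacity are arbitrary, hypothetical choices.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read everything from an already-open stream
 * (regular file, pipe, or stdin) into a malloc'd, NUL-terminated buffer.
 * Returns NULL on failure; the caller must free() the result. */
char *read_stream(FILE *fp, size_t *out_len)
{
    assert(fp != NULL && out_len != NULL);

    size_t cap = 4096, len = 0;
    char *buf = malloc(cap);
    if (!buf)
        return NULL;

    for (;;) {
        /* Always keep one byte free for the terminating '\0'. */
        len += fread(buf + len, 1, cap - len - 1, fp);
        if (len < cap - 1)
            break;                      /* short read: EOF or error */
        cap *= 2;                       /* buffer is full: double it */
        char *tmp = realloc(buf, cap);
        if (!tmp) {
            free(buf);
            return NULL;
        }
        buf = tmp;
    }

    if (ferror(fp)) {
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    *out_len = len;
    return buf;
}
```

Calling `read_stream(stdin, &n)` would then work the same way whether the input is redirected from a file or piped from another program.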
