Saturday, February 19, 2011

How can I sort an array returned from File.ReadAllLines on an alphabetical member?

I am reading a .csv file and returning its lines in a string array. One of the fields is the manufacturer, for which I have Toyota, Ford, etc.

I want to sort the array of rows (it can be another collection) alphabetically by the value in the manufacturer field.

So I'd have:

28437 Ford Fiesta
328   Honda Civic
34949 Toyota Yaris

and so forth...

What would be the best way to do this using C# and no database? I say no database because I could insert the csv into a table in a SQL Server database, and then query it and return the data. But this data is going into an HTML table built on the fly, which would make the database approach a little long-winded.

From stackoverflow
  • What version of .NET are you using? If you're using 3.5 - or can use C# 3.0 and LINQBridge - then I'd definitely go with LINQ. First transform each line into some appropriate object, and then use OrderBy:

    var cars = lines.Select(line => Car.ParseLine(line))
                    .OrderBy(car => car.Manufacturer);
    
  • Hi,

    I use .Net 3.5.

    The LINQ approach is good, however I haven't really taught myself LINQ completely - yet. BTW, where do Car and car come from (the declarations)?

    When I sort the array, I need to have the sorted version of the data in a collection (ideally an array). So I sort the data into alphabetical order, and it's important to put it back in an array so I can iterate through it (my pattern is a foreach over each line in csvcollectiondata, splitting each string).

    Car is an error btw.

    P.S. I read your blog fairly often.

    Jon Skeet : You'd need to write the Car class yourself - just the simple properties and a parser. The result of a LINQ query is enumerable, so you can still use foreach.
  • If you just have a bunch of strings in an Array, use

    Array.Sort(myArray)
    

    That will put the strings in "myArray" in alphabetical order, using the default string comparison (case-sensitive and based on the current culture).

    If you want to do different comparisons (case-insensitive, for example), you can define your own IComparer or use LINQ extension methods, preferably with a lambda expression like

            string [] sArray = new string[] { "fsdhj", "FA", "FX", "fxx", "Äbc" };
            sArray = sArray.OrderBy(s => s.ToLowerInvariant()).ToArray();
    

    There's a whole bunch of other sorting methods, but these are the most basic. I could give you a more detailed answer to your problem, if I understood better what your input-object looks like. As long as it's just an array of strings, you should be fine with the above.

    In response to the first two comments below I should also note that the invariant string-sorting method given above is not the best one for that particular job (see comments).

    However it does illustrate the use of extension methods with lambda expressions, which come in very handy in situations where you don't have predefined IComparer-classes.

    Marc Gravell : Or more simply: Array.Sort(sArray, StringComparer.InvariantCultureIgnoreCase);
    Jon Skeet : Marc's solution is more correct as well as simpler. Case-insensitive sorting shouldn't be done by just converting to upper/lower case first. That has issues in some cultures (e.g. Turkish).
    TToni : I'll have to take your word for the issues in Turkish or other languages. It works quite well in German though. Anyway, thanks for the comment. I should have used a better example.
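
    A minimal sketch of the comparer-based approach from the comments above, using the same sample strings. Note the swap: it uses StringComparer.OrdinalIgnoreCase rather than the InvariantCultureIgnoreCase comparer Marc mentions, because ordinal comparison ignores linguistic rules and so places "Ä" after the ASCII letters, which makes the result easy to predict. The class and method names are made up for illustration.

    ```csharp
    using System;
    using System.Linq;

    public static class CaseInsensitiveSortSketch
    {
        // Returns a new, case-insensitively sorted array; the input is left untouched.
        public static string[] SortIgnoringCase(string[] input)
        {
            return input.OrderBy(s => s, StringComparer.OrdinalIgnoreCase).ToArray();
        }

        public static void Main()
        {
            string[] sArray = { "fsdhj", "FA", "FX", "fxx", "Äbc" };

            // LINQ version: builds a new array.
            string[] sorted = SortIgnoringCase(sArray);
            Console.WriteLine(string.Join(", ", sorted));

            // In-place version: reorders sArray itself using the same comparer.
            Array.Sort(sArray, StringComparer.OrdinalIgnoreCase);
            Console.WriteLine(string.Join(", ", sArray));
        }
    }
    ```

    Passing a comparer avoids creating lowered copies of every string, and swapping in a culture-aware comparer such as StringComparer.InvariantCultureIgnoreCase is a one-argument change if linguistic ordering of accented characters matters.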
  • Without trying to second guess Jon, I believe he is suggesting that you would create a class Car ("some appropriate object") with the necessary properties, and the ability to populate a Car from a line:

    public class Car
    {
        public int Id {get;set;}
        public string Manufacturer {get;set;}
        public string Model {get;set;}
    
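        // Assumption: fields are comma-separated; change this to match your file.
        private const char DELIMITER = ',';

        // Builds a Car from one line of the form "id,manufacturer,model".
        public static Car ParseLine(string line)
        {
            string[] parts = line.Split(DELIMITER);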
            return new Car
            {
                Id = int.Parse(parts[0]),
                Manufacturer = parts[1],
                Model = parts[2]
            };
        }
    }
    

    i.e. treating the lines as objects. Then with LINQ things become quite simple:

            var query = from line in lines
                        let car = Car.ParseLine(line)
                        orderby car.Manufacturer
                        select car;
    
            var arr = query.ToArray();
    

    Note that you can do this without LINQ too, for example (using a Car[] array) - an in-place array sort:

            Array.Sort(arr, (x, y) => string.Compare(x.Manufacturer, y.Manufacturer));
    

    or the same with List<Car>:

            list.Sort((x, y) => string.Compare(x.Manufacturer, y.Manufacturer));
    
    Jon Skeet : Feel free to second guess me any time. You pretty much always either get it right or manage to explain something that I wasn't *actually* thinking, but should have been :)
    VVS : You two should get paid by SO ;)
    Marc Gravell : @David - now *there's* a thought
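
    Putting the pieces of this answer together, here is a rough end-to-end sketch: parse each line, sort by manufacturer, and get an array back that can be walked with foreach. It assumes comma-separated lines with exactly three fields and no quoting, uses inline sample lines in place of File.ReadAllLines, and sorts case-insensitively with StringComparer.OrdinalIgnoreCase; those choices and the SortCarsSketch name are assumptions, not something stated in the question.

    ```csharp
    using System;
    using System.Linq;

    public class Car
    {
        public int Id { get; set; }
        public string Manufacturer { get; set; }
        public string Model { get; set; }

        // Assumption: exactly three comma-separated fields, no quoting or embedded commas.
        public static Car ParseLine(string line)
        {
            string[] parts = line.Split(',');
            return new Car
            {
                Id = int.Parse(parts[0].Trim()),
                Manufacturer = parts[1].Trim(),
                Model = parts[2].Trim()
            };
        }
    }

    public static class SortCarsSketch
    {
        // Parses every line, then sorts the cars by manufacturer, ignoring case.
        public static Car[] ParseAndSort(string[] lines)
        {
            return lines.Select(line => Car.ParseLine(line))
                        .OrderBy(car => car.Manufacturer, StringComparer.OrdinalIgnoreCase)
                        .ToArray();
        }

        public static void Main()
        {
            // Stand-in for File.ReadAllLines(path); the values are made up.
            string[] lines = { "34949,Toyota,Yaris", "328,Honda,Civic", "28437,Ford,Fiesta" };

            foreach (Car car in ParseAndSort(lines))
            {
                Console.WriteLine("{0} {1} {2}", car.Id, car.Manufacturer, car.Model);
            }
        }
    }
    ```

    Because the result is a plain Car[], the loop body is where each row of the HTML table could be written out, with no further string splitting needed.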
  • In Jon's post he is parsing the lines into objects first, then sorting. You could also sort the strings directly, keyed on part of each string. Here is an example that sorts the list on a substring (10 characters starting at index 6). Note that Substring throws an ArgumentOutOfRangeException for any line shorter than 16 characters, so this only suits fixed-width lines:

    List<string> lines = new List<string>
        {
         "34949 Toyota Yaris",
         "328   Honda Civic",
         "28437 Ford Fiesta"
        };
    
    var sortedLines = lines.OrderBy(line => line.Substring(6, 10));
    // Or make it case insensitive
    // var sortedLines = lines.OrderBy(line => line.Substring(6, 10), StringComparer.InvariantCultureIgnoreCase);
    
    foreach (var line in sortedLines)
    {
        Console.WriteLine(line);
    }
    
  • Thanks guys.

    I will try all of these approaches.

    Seems like I'd better learn LINQ properly, as it will save me a lot of time!

    I'll let you know how I get on.
