Friday, April 8, 2011

Scala: splitting a list using a predicate for sublists

I just had a use case where I needed to split a list into n sublists, so that elements are taken in order from the original list and grouped while the predicate is true for the sublist (when it is false, a new sublist is started). I didn't find this functionality in the standard library, and I thought it was a good exercise to try to solve it in a functional style (as I'm far from a functional guru).

Below is the code I came up with. But I suspect it can be improved a lot. Can you help me find a better way to code this?

class ListWithSplitter[A](val theList:List[A])
{
  private def sublistWhile(list:List[A], pred:(List[A] => Boolean)):(List[A],List[A]) =
  {
    def combine(okList:List[A], remaining:List[A], pred:(List[A] => Boolean)):(List[A],List[A]) =
    {
      if(pred(okList ::: remaining.head :: Nil))
        combine(okList ::: remaining.head :: Nil, remaining.tail, pred)
      else
        (okList, remaining)
    }

    list match {
      case Nil => (Nil, Nil)
      case x :: Nil => (list, Nil)
      case x :: xs => combine(List(x), xs, pred)
    }
  }

  private def combinedSplit(list:List[A], pred:(List[A] => Boolean)):List[List[A]] =
  {
    val r = sublistWhile(list, pred)
    r match {
      case (Nil, Nil) => List(Nil)
      case (x, Nil) => List(x)
      case (x, y) => x :: combinedSplit(y, pred)
    }
  }

  def combinedSplit(pred:(List[A] => Boolean)):List[List[A]] =
  {
    combinedSplit(theList, pred)
  }
}

trait ListCombinedSplit
{
  implicit def list2combSplitter[A](x:List[A]) : ListWithSplitter[A] = new ListWithSplitter(x)
}

object ListSplitter extends ListCombinedSplit {

  def main(args:Array[String])
  {
    // sample usage: sum of each sublist is less than 100
    val a = List(4, 59, 10, 24, 42, 9, 2, 44, 44, 44, 44)
    val b = a combinedSplit { list:List[Int] => ((0 /: list)(_ + _)) < 100 }

    b foreach println
  }
}

Result of sample is:

List(4, 59, 10, 24)
List(42, 9, 2, 44)
List(44, 44)
List(44)
From stackoverflow
  • Your code has the problem that mere invocations of your predicate make it O(N^2). Also, I think the object oriented stuff is too complicated.

    I came up with:

    scala> def lsplit(x : List[Int], limit : Int) =
      (x foldLeft (0, Nil.asInstanceOf[List[List[Int]]])) ((x, y) => x match {
        case (v, l::xs) if v+y < limit => (v+y, (y::l)::xs)
        case (_, xs) => (y, (y::Nil)::xs)
      })._2.reverse.map(x => x.reverse)
    
    lsplit: (List[Int],Int)List[List[Int]]
    
    scala> lsplit(List.range(1,50), 100)
    res9: List[List[Int]] = List(List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13), List(14, 15, 16, 17, 18, 19), List(20, 21, 22, 23), List(24, 25, 26), List(27, 28, 29), List(30, 31, 32), List(33, 34), List(35, 36), List(37, 38), List(39, 40), List(41, 42), List(43, 44), List(45, 46), List(47, 48), List(49))
    
    scala> lsplit(List.range(1,50), 122)
    res10: List[List[Int]] = List(List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15), List(16, 17, 18, 19, 20, 21), List(22, 23, 24, 25, 26), List(27, 28, 29, 30), List(31, 32, 33), List(34, 35, 36), List(37, 38, 39), List(40, 41), List(42, 43), List(44, 45), List(46, 47), List(48, 49))
    

    This doesn't allow you to specify arbitrary predicate, but you can make it work for fold-like predicates by changing manipulation with the first elent of the pair in foldLeft.

    Germán : Thanks... not as generic as I intended but will do for my use case!

0 comments:

Post a Comment