Native XML processing in C++ (BoostCon'11)
LEESA: Toward Native XML Processing Using Multi-paradigm Design in C++

May 16, 2011

Dr. Sumant Tambe, Software Engineer, Real-Time Innovations
Dr. Aniruddha Gokhale, Associate Professor of EECS Dept., Vanderbilt University

www.dre.vanderbilt.edu/LEESA
 XML Programming in C++. Specifically, data binding
 What XML data binding stole from us!
 Restoring order: LEESA
 LEESA by examples
 LEESA in detail
      Architecture of LEESA
      Type-driven data access
      XML schema representation using Boost.MPL
      LEESA descendant axis and strategic programming
      Compile-time schema conformance checking
      LEESA expression templates
 Evaluation: productivity, performance, compilers
 C++0x and LEESA
 LEESA in future
[Figure: XML Infoset, Cω]
 Type system
    Regular types
    Anonymous complex elements
    Repeating subsequence
 XML data model
    XML information set (infoset)
    E.g., elements, attributes, text, comments, processing
     instructions, namespaces, etc.
 Schema languages
    XSD, DTD, RELAX NG
 Programming Languages
    XPath, XQuery, XSLT
 Idioms and best practices
    XPath: Child, parent, sibling, descendant axes;
     wildcards

 Predominant categories & examples (non-exhaustive)
 DOM API
    Apache Xerces-C++, RapidXML, TinyXML, Libxml2, PugiXML, lxml,
     Arabica, MSXML, and many more …
 Event-driven APIs (SAX and SAX-like)
    Apache SAX API for C++, Expat, Arabica, MSXML, CodeSynthesis
     XSD/e, and many more …
 XML data binding
    Liquid XML Studio, CodeSynthesis XSD, Codalogic LMX, xmlplus,
     OSS XSD, XBinder, and many more …
 Boost XML??
    No XML library in Boost (as of May 16, 2011)
    Issues: very broad requirements, large XML specifications, good XML
     libraries exist already, encoding issues, round tripping issues, and
     more …

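[Figure: XML data binding workflow. An XML schema is input to an XML schema compiler (code generator), which generates an object-oriented data access layer as C++ code. The XML query/traversal program uses that layer; both are input to a C++ compiler, which generates the executable.]
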
 Process
      Automatically generate vocabulary-specific classes from the schema
      Develop application code using generated classes
      Parse an XML into an object model at run-time
      Manipulate the objects directly (CRUD)
      Serialize the objects back to XML
 Example: Book catalog xml and xsd

XML:
<catalog>
  <book>
    <name>The C++ Programming Language</name>
    <price>71.94</price>
    <author>
      <name>Bjarne Stroustrup</name>
      <country>USA</country>
    </author>
  </book>
  <book>
    <name>C++ Coding Standards</name>
    <price>36.41</price>
    <author>
      <name>Herb Sutter</name>
      <country>USA</country>
    </author>
    <author>
      <name>Andrei Alexandrescu</name>
      <country>USA</country>
    </author>
  </book>
</catalog>

XSD:
<xs:complexType name="book">
 <xs:sequence>
   <xs:element name="name" type="xs:string" />
   <xs:element name="price" type="xs:double" />
   <xs:element name="author" maxOccurs="unbounded">
    <xs:complexType>
     <xs:sequence>
      <xs:element name="name" type="xs:string" />
      <xs:element name="country" type="xs:string" />
     </xs:sequence>
    </xs:complexType>
   </xs:element>
 </xs:sequence>
</xs:complexType>

<xs:element name="catalog">
 <xs:complexType>
  <xs:sequence>
    <xs:element name="book"
                type="lib:book"
                maxOccurs="unbounded">
    </xs:element>
  </xs:sequence>
 </xs:complexType>
</xs:element>
 Example: Book catalog xsd and generated C++ code
XSD:
<xs:complexType name="book">
 <xs:sequence>
   <xs:element name="name"
                type="xs:string" />
   <xs:element name="price"
                type="xs:double" />
   <xs:element name="author"
                maxOccurs="unbounded">
    <xs:complexType>
     <xs:sequence>
      <xs:element name="name"
                   type="xs:string" />
      <xs:element name="country"
                   type="xs:string" />
     </xs:sequence>
    </xs:complexType>
  </xs:element>
 </xs:sequence>
</xs:complexType>
<xs:element name="catalog">
 <xs:complexType>
  <xs:sequence>
    <xs:element name="book"
                 type="lib:book"
                 maxOccurs="unbounded">
    </xs:element>
  </xs:sequence>
 </xs:complexType>
</xs:element>

Generated C++ code:
class author {
  private:
    std::string name_;
    std::string country_;
  public:
    std::string get_name() const;
    void set_name(std::string const &);
    std::string get_country() const;
    void set_country(std::string const &);
};

class book {
  private:
    std::string name_;
    double price_;
    std::vector<author> author_sequence_;
  public:
    typedef std::vector<author>::const_iterator author_const_iterator;
    std::string get_name() const;
    void set_name(std::string const &);
    double get_price() const;
    void set_price(double);
    // Returned by const reference so that begin() and end() refer to the same container.
    std::vector<author> const & get_author() const;
    void set_author(std::vector<author> const &);
};

class catalog {
  private:
    std::vector<book> book_sequence_;
  public:
    typedef std::vector<book>::const_iterator book_const_iterator;
    std::vector<book> const & get_book() const;
    void set_book(std::vector<book> const &);
};
 Book catalog application program
   Example: Find all author names
  std::vector<std::string>
  get_author_names (const catalog & root)
  {
    std::vector<std::string> name_seq;
    for (catalog::book_const_iterator bi (root.get_book().begin ());
         bi != root.get_book().end ();
         ++bi)
    {
      for (book::author_const_iterator ai (bi->get_author().begin ());
           ai != bi->get_author().end ();
           ++ai)
      {
        name_seq.push_back(ai->get_name());
      }
    }
    return name_seq;
  }


 Advantages of XML data binding
    Easy to use
    Vocabulary-specific API
    Type safety
    C++ programming style and idioms
    Efficient
 We lost something along the way. A lot actually!
 Loss of succinctness
    XML child axis replaced by nested for loops
    Example: Find all author names

   Using XPath (1 line)
/catalog/book/author/name/text()

   Using XML data binding (20 lines)
std::vector<std::string>
get_author_names (const catalog & root)
{
  std::vector<std::string> name_seq;
  for (catalog::book_const_iterator bi =
         root.get_book().begin ();
       bi != root.get_book().end ();
       ++bi)
  {
    for (book::author_const_iterator ai =
           bi->get_author().begin ();
         ai != bi->get_author().end ();
         ++ai)
    {
      name_seq.push_back(ai->get_name());
    }
  }
  return name_seq;
}
 Loss of expressive power
     Example: “Find all names recursively”
     What if catalogs are recursive too!
     Descendant axis replaced by manual recursion. Hard to maintain.

    Using XPath (1 line)
//name/text()

    Recursive catalog XML
<catalog>
  <catalog>
    <catalog>
      <catalog>
        <book><name>...</name></book>
        <book><name>...</name></book>
      </catalog>
      <book>...</book>
      <book>...</book>
    </catalog>
    <book>
      <name>...</name>
      <price>...</price>
      <author>
        <name>...</name>
        <country>...</country>
      </author>
    </book>
    <book>...</book>
    <book>...</book>
  </catalog>
</catalog>

    Using XML data binding with BOOST_FOREACH (20+ lines)
std::vector<std::string> get_author_names (const catalog & c)
{
  std::vector<std::string> name_seq;
  BOOST_FOREACH(const book &b, c.get_book())
  {
    BOOST_FOREACH(const author &a, b.get_author())
    {
      name_seq.push_back(a.get_name());
    }
  }
  return name_seq;
}

std::vector<std::string> get_all_names (const catalog & root)
{
  std::vector<std::string> name_seq(get_author_names(root));
  BOOST_FOREACH (const catalog &c, root.get_catalog())
  {
    std::vector<std::string> names = get_all_names(c);
    // std::vector::insert needs the insertion position as its first argument.
    name_seq.insert(name_seq.end(), names.begin(), names.end());
  }
  return name_seq;
}
 Loss of XML programming idioms
   Cannot use “wildcard” types
   Example: Without spelling “Catalog” and “Book”, find names that are
    exactly at the third level.
    Using XPath (1 line)
/*/*/name/text()

    Using XML data binding
std::vector<std::string>
get_author_names (const catalog & root)
{
  std::vector<std::string> name_seq;
  . . .
  . . .

  return name_seq;
}


 Also known as structure-shyness
    Descendant axis and wildcards don’t spell out every detail of the
     structure
 Casting Catalog to Object class isn’t good enough
    object.get_book(): compiler error!
    object.get_children(): inevitable casting!
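
The casting problem named above can be illustrated with a minimal sketch. It assumes a hypothetical common base class `Object` with a `get_children()` accessor; neither is taken from a real data binding tool, and `add_child`, `third_level_names`, and `make_sample` are names invented for this illustration. The query `/*/*/name` can then be written without naming Catalog or Book, but only by giving up static typing at the last step.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical common base class (an assumption for this sketch).
struct Object {
  virtual ~Object() {}
  void add_child(std::shared_ptr<Object> child) { children_.push_back(std::move(child)); }
  const std::vector<std::shared_ptr<Object>> & get_children() const { return children_; }
 private:
  std::vector<std::shared_ptr<Object>> children_;
};

struct Name : Object {
  explicit Name(std::string t) : text(std::move(t)) {}
  std::string text;
};
struct Author  : Object {};
struct Book    : Object {};
struct Catalog : Object {};

// Names exactly at the third level (XPath: /*/*/name), written without
// naming Catalog or Book. The static type is lost along the way, so the
// last step needs a run-time cast.
std::vector<std::string> third_level_names(const Object & root) {
  std::vector<std::string> names;
  for (const auto & second : root.get_children())
    for (const auto & third : second->get_children())
      if (const Name * n = dynamic_cast<const Name *>(third.get()))
        names.push_back(n->text);
  return names;
}

// Sample data: one catalog holding one book; the book has a name and an
// author, and the author has a name one level further down.
std::shared_ptr<Object> make_sample() {
  auto author = std::make_shared<Author>();
  author->add_child(std::make_shared<Name>("Herb Sutter"));
  auto book = std::make_shared<Book>();
  book->add_child(std::make_shared<Name>("C++ Coding Standards"));
  book->add_child(author);
  auto catalog = std::make_shared<Catalog>();
  catalog->add_child(book);
  return catalog;
}
```

In this sketch a mistake such as looking for a type that can never appear at the third level compiles without complaint and simply returns nothing at run time, which is the loss of type safety the slide points at.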
 Hybrid approach: Pass XPath expression as a string
    No universal support
    Boilerplate setup code
       DOM, XML namespaces, memory management
    Casting is inevitable
    Look and feel of two APIs is (vastly) different
       iterateNext() vs. begin()/end()
    Can’t use predicates on data outside xml
       E.g. Find authors of highest selling books
        “/book[?condition?]/author/name”

      Using XML data binding + XPath
DOMElement* root (static_cast<DOMElement*> (c._node ()));
DOMDocument* doc (root->getOwnerDocument ());

dom::auto_ptr<DOMXPathExpression> expr (
  doc->createExpression (
    xml::string ("//author").c_str (),
    resolver.get ()));
dom::auto_ptr<DOMXPathResult> r (
  expr->evaluate (
    doc, DOMXPathResult::ITERATOR_RESULT_TYPE, 0));

while (r->iterateNext ())
{
  DOMNode* n (r->getNodeValue ());

  author* a (
    static_cast<author*> (
      n->getUserData (dom::tree_node_key)));
  cout << "Name   : " << a->get_name () << endl;
}
 Schema-specificity (too much object-oriented bias?)
    Each class has a different interface (not generic)
    Naming conventions of XML data binding tools vary

              Catalog: +get_Book()
              Book:    +get_Author(), +get_Price(), +get_name()
              Author:  +get_Name(), +get_Country()

 Lost succinctness (axis-oriented expressions)
 Lost structure-shyness (descendant axis, wildcards)
 Can’t use Visitor design pattern (stateful traversal) with XPath
LEESA: Language for Embedded QuEry and TraverSAl
Multi-paradigm Design in C++
 A book catalog xsd
 Generated six C++ classes
    Complex classes: Catalog, Book, Author
    Simple classes: Name, Price, Country
 Price, Country, and Name are simple wrappers
 Catalogs are recursive

[Figure: class diagram. Catalog (+get_Book(), +get_Catalog()) contains Books and nested Catalogs (1 to *). Book (+get_Author(), +get_Price(), +get_Name()) contains one Name, one Price, and Authors (1 to *). Author (+get_Name(), +get_Country()) contains one Name and one Country.]

<catalog>
  <catalog>
    <catalog>
      <catalog>...</catalog>
    </catalog>
    <book>
      <name>...</name>
      <price>...</price>
      <author>
        <name>...</name>
        <country>...</country>
      </author>
    </book>
  </catalog>
</catalog>
 Restoring succinctness
    Example: Find all author names
    Child axis traversal

Using XPath (1 line)
/catalog/book/author/name/text()

Using LEESA (3 lines)
Catalog croot = load_catalog("catalog.xml");
std::vector<Name> author_names =
evaluate(croot, Catalog() >> Book() >> Author() >> Name());
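
How a typed child-axis expression such as `Catalog() >> Book() >> Author() >> Name()` might be evaluated can be sketched in a few lines. This is not LEESA's implementation: the `sketch` namespace, the `children` overloads, `Path`, `Last`, `walk`, and `sample_catalog` are all hypothetical names chosen for illustration, and the classes are hand-written stand-ins for generated code.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace sketch {

struct Name    { std::string value; };
struct Author  { Name name; };
struct Book    { Name name; std::vector<Author> authors; };
struct Catalog { std::vector<Book> books; };

// Type-driven child access: the overload is selected by the parent type
// and a tag object of the requested child type.
inline std::vector<Book>   children(const Catalog & c, Book)  { return c.books; }
inline std::vector<Author> children(const Book & b, Author)   { return b.authors; }
inline std::vector<Name>   children(const Author & a, Name)   { return std::vector<Name>(1, a.name); }

// A path is only a list of types; it carries no run-time data.
template <class... Steps> struct Path {};

template <class A, class B>
Path<A, B> operator>>(A, B) { return Path<A, B>(); }

template <class... Steps, class B>
Path<Steps..., B> operator>>(Path<Steps...>, B) { return Path<Steps..., B>(); }

// Last<...>::type is the final type in a path, i.e. the result type.
template <class... Steps> struct Last;
template <class T> struct Last<T> { typedef T type; };
template <class T, class... Rest> struct Last<T, Rest...> : Last<Rest...> {};

// Base case: the path is exhausted, so the current set is the result.
template <class T>
std::vector<T> walk(const std::vector<T> & current, Path<T>) { return current; }

// Recursive case: replace the current set by all children of type Next.
template <class T, class Next, class... Rest>
std::vector<typename Last<Next, Rest...>::type>
walk(const std::vector<T> & current, Path<T, Next, Rest...>) {
  std::vector<Next> next;
  for (const T & item : current) {
    const std::vector<Next> kids = children(item, Next());
    next.insert(next.end(), kids.begin(), kids.end());
  }
  return walk(next, Path<Next, Rest...>());
}

template <class Root, class... Rest>
std::vector<typename Last<Root, Rest...>::type>
evaluate(const Root & root, Path<Root, Rest...> path) {
  return walk(std::vector<Root>(1, root), path);
}

// Sample data mirroring the book catalog used on the earlier slides.
inline Catalog sample_catalog() {
  Author stroustrup;   stroustrup.name.value = "Bjarne Stroustrup";
  Author sutter;       sutter.name.value = "Herb Sutter";
  Author alexandrescu; alexandrescu.name.value = "Andrei Alexandrescu";

  Book tcpl;      tcpl.name.value = "The C++ Programming Language";
  tcpl.authors.push_back(stroustrup);
  Book standards; standards.name.value = "C++ Coding Standards";
  standards.authors.push_back(sutter);
  standards.authors.push_back(alexandrescu);

  Catalog c;
  c.books.push_back(tcpl);
  c.books.push_back(standards);
  return c;
}

}  // namespace sketch
```

In this sketch a path that skips a level, such as `Catalog() >> Name()`, fails to compile because no matching `children` overload exists, which hints at how schema conformance can be checked at compile time.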
 Restoring expressive power
     Example: Find all names recursively
     Descendant axis traversal

Using XPath (1 line)
//name/text()

Using LEESA (2 lines)
Catalog croot = load_catalog("catalog.xml");
std::vector<Name> names = DescendantsOf(Catalog(), Name())(croot);

 Fully statically typed execution
 Efficient: LEESA “knows” where Names are!
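
A statically typed descendant traversal can be sketched with overloads that enumerate the children of each class and a generic collector that recurses through them. This is an illustrative assumption, not LEESA's mechanism: `for_each_child`, `Collector`, `descendants_of`, and `sample_catalog` are hypothetical names, and unlike LEESA this sketch does not prune subtrees that cannot contain the target type.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Name    { std::string value; };
struct Price   { double value = 0.0; };
struct Author  { Name name; };
struct Book    { Name name; Price price; std::vector<Author> authors; };
struct Catalog { std::vector<Book> books; std::vector<Catalog> catalogs; };

// Fallback: a type with no listed children (Name, Price) ends the descent.
template <class T, class F>
void for_each_child(const T &, F) {}

// One overload per complex class; each child keeps its static type.
template <class F>
void for_each_child(const Author & a, F f) { f(a.name); }

template <class F>
void for_each_child(const Book & b, F f) {
  f(b.name);
  f(b.price);
  for (const Author & a : b.authors) f(a);
}

template <class F>
void for_each_child(const Catalog & c, F f) {
  for (const Book & b : c.books) f(b);
  for (const Catalog & nested : c.catalogs) f(nested);
}

// Collects every descendant of type Target, depth-first.
template <class Target>
struct Collector {
  std::vector<Target> * out;
  // Exact match: record the node, then keep descending below it.
  void operator()(const Target & hit) const {
    out->push_back(hit);
    for_each_child(hit, *this);
  }
  // Any other type: just descend.
  template <class Other>
  void operator()(const Other & node) const { for_each_child(node, *this); }
};

template <class Target, class Root>
std::vector<Target> descendants_of(const Root & root) {
  std::vector<Target> out;
  Collector<Target> collect = { &out };
  for_each_child(root, collect);
  return out;
}

// Sample data: a root catalog with one book and one nested catalog.
Catalog sample_catalog() {
  Author author;   author.name.value = "Herb Sutter";
  Book inner_book; inner_book.name.value = "C++ Coding Standards";
  inner_book.authors.push_back(author);
  Catalog inner;   inner.books.push_back(inner_book);

  Book outer_book; outer_book.name.value = "The C++ Programming Language";
  Catalog root;
  root.books.push_back(outer_book);
  root.catalogs.push_back(inner);
  return root;
}
```

A call such as `descendants_of<Name>(sample_catalog())` returns book names and author names alike, matching the `//name` query; no casts are involved because overload resolution picks the right step for every type.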
 Restoring XML programming idioms (structure-shyness)
     Example: Without spelling intermediate types, find names that are
      exactly at the third level.
     Wildcards in a typed query!

Using XPath (1 line)
/*/*/name/text()

Using LEESA (3 lines)
namespace LEESA { struct Underbar {} _; }
Catalog croot = load_catalog("catalog.xml");
std::vector<Name> names =
    LevelDescendantsOf(Catalog(), _, _, Name())(croot);

 Fully statically typed execution
 Efficient: LEESA “knows” where Books, Authors, and Names are!
 User-defined filters
      Example: Find names of authors from Country == USA
      Basically unary functors
      Supports free functions, function objects, boost::bind, C++0x lambda

Using XPath (1 line)
//author[country/text() = 'USA']/name/text()

Using LEESA (6 lines)
Catalog croot = load_catalog("catalog.xml");
std::vector<Name> author_names = evaluate(croot,
      Catalog()
   >> DescendantsOf(Catalog(), Author())
   >> Select(Author(), [](const Author &a) { return a.get_Country() == "USA"; })
   >> Name());
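
The statement that filters are basically unary functors can be made concrete with a small sketch. It assumes a simplified `Author` with plain string members and a hypothetical helper `keep_if`; neither is part of LEESA. The point is that one template accepts a free function, a function object, and a lambda alike. The third sample author is an invented entry added so that the filter has something to reject.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

struct Author { std::string name; std::string country; };

// Keeps the items for which the unary predicate returns true.
template <class T, class Pred>
std::vector<T> keep_if(const std::vector<T> & items, Pred pred) {
  std::vector<T> kept;
  for (const T & item : items)
    if (pred(item)) kept.push_back(item);
  return kept;
}

// Filter written as a free function.
bool from_usa(const Author & a) { return a.country == "USA"; }

// Filter written as a function object that carries its own state.
struct FromCountry {
  explicit FromCountry(std::string c) : country(std::move(c)) {}
  bool operator()(const Author & a) const { return a.country == country; }
  std::string country;
};

std::vector<Author> sample_authors() {
  std::vector<Author> authors;
  authors.push_back(Author{"Bjarne Stroustrup", "USA"});
  authors.push_back(Author{"Herb Sutter", "USA"});
  // Hypothetical entry, added only so the filters reject something.
  authors.push_back(Author{"Example Author", "Elsewhere"});
  return authors;
}
```

A lambda works the same way, for example `keep_if(authors, [](const Author & a) { return a.country != "USA"; })`.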
 Tuplefication!!
      Example: Pair the name and country of all the authors
      std::vector of boost::tuple<Name *, Country *>

Using XPath
???????????????????????????????

Using LEESA (5 lines)
Catalog croot = load_catalog("catalog.xml");
std::vector<boost::tuple<Name *, Country *> > tuples =
evaluate(croot, Catalog()
             >> DescendantsOf(Catalog(), Author())
             >> MembersAsTupleOf(Author(), make_tuple(Name(), Country())));
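
The idea of pairing members of each author into tuples can be sketched with standard C++ alone. The sketch below swaps `boost::tuple` for `std::tuple` and selects members through pointers to data members rather than through type tags, so it illustrates the result shape, not how `MembersAsTupleOf` works internally. The name `members_as_tuples` and the simplified classes are assumptions made for this example.

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <vector>

struct Name    { std::string value; };
struct Country { std::string value; };
struct Author  { Name name; Country country; };

// For every owner, builds one tuple of pointers to the selected members.
// The pointers refer into `owners`, which must outlive the result.
template <class Owner, class... Members>
std::vector<std::tuple<const Members *...>>
members_as_tuples(const std::vector<Owner> & owners, Members Owner::*... fields) {
  std::vector<std::tuple<const Members *...>> rows;
  for (const Owner & owner : owners)
    rows.push_back(std::make_tuple(&(owner.*fields)...));
  return rows;
}

std::vector<Author> sample_authors() {
  std::vector<Author> authors;
  authors.push_back(Author{Name{"Herb Sutter"}, Country{"USA"}});
  authors.push_back(Author{Name{"Andrei Alexandrescu"}, Country{"USA"}});
  return authors;
}
```

A call such as `members_as_tuples(authors, &Author::name, &Author::country)` yields a `std::vector<std::tuple<const Name *, const Country *>>`, and the tuple's element types are fixed at compile time by the members that were requested.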
 Using visitors
    Gang-of-four Visitor design pattern
    Visit methods for all Elements
    Example: Visit catalog, books, authors, and countries in that order
    Stateful, statically typed traversal
    fixed depth child axis

[Figure: MyVisitor (+visit_Catalog(), +visit_Book(), +visit_Author(), +visit_Name(), +visit_Country(), +visit_Price()) shown next to the catalog class diagram, and an example tree: Catalog contains Book1 and Book2; Book1 contains authors A1 and A2; Book2 contains A3 and A4; each author A1 to A4 has a country C1 to C4.]

Using XPath
???????????????????????????????

Using LEESA (7 lines)
Catalog croot = load_catalog("catalog.xml");
MyVisitor visitor;
std::vector<Country> countries =
evaluate(croot,     Catalog() >> visitor
                 >> Book()    >> visitor
                 >> Author()  >> visitor
                 >> Country() >> visitor);
*


 Using visitors (depth-first)
    Gang-of-four Visitor design pattern
    Visit methods for all Elements
    Example: Visit catalog, books, authors, and names in depth-first manner
    Stateful, statically typed traversal
    Fixed-depth child axis

[Figure: UML class diagram. Catalog (+get_Book(), +get_Catalog()) contains Book and Catalog; Book (+get_Author(), +get_Price(), +get_Name()) contains Price, Name, and Author; Author (+get_Name(), +get_Country()) contains Name and Country. MyVisitor declares visit_Catalog(), visit_Book(), visit_Author(), visit_Name(), visit_Country(), and visit_Price().]

Using XPath
???????????????????????????????

Using LEESA (7 lines)
Catalog croot = load_catalog("catalog.xml");
MyVisitor visitor;
std::vector<Book> books =
evaluate(croot,      Catalog() >> visitor
                 >>= Book()    >> visitor
                 >>= Author()  >> visitor
                 >>= Country() >> visitor);

Default precedence. No parentheses needed.

[Figure: object tree. Catalog at the root, then Book1 and Book2, authors A1 to A4, and countries C1 to C4.]
                                                                                                                    23 / 54
Child Axis (breadth-first)
    Catalog() >> Book() >> v >> Author() >> v

Child Axis (depth-first)
    Catalog() >>= Book() >> v >>= Author() >> v

Parent Axis (breadth-first)
    Name() << v << Author() << v << Book() << v

Parent Axis (depth-first)
    Name() << v <<= Author() << v <<= Book() << v

Default precedence. No parentheses needed.

                                                                         24 / 54
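The difference between the breadth-first and depth-first child axes is the order in which elements are visited. The sketch below does not use LEESA; it hand-codes both orders over a hypothetical object model (the `Catalog`, `Book`, and `Author` structs and the helper names are assumptions for illustration) so the two orders can be compared directly.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical hand-written object model; real code would use classes
// generated by an XML data binding tool.
struct Author  { std::string name; };
struct Book    { std::string title; std::vector<Author> authors; };
struct Catalog { std::vector<Book> books; };

// Breadth-first child axis (the ">>" style): visit every Book first,
// then every Author of those books.
std::vector<std::string> breadth_first(const Catalog& c) {
    std::vector<std::string> order;
    for (const Book& b : c.books) order.push_back(b.title);
    for (const Book& b : c.books)
        for (const Author& a : b.authors) order.push_back(a.name);
    return order;
}

// Depth-first child axis (the ">>=" style): visit a Book and then its
// Authors before moving on to the next Book.
std::vector<std::string> depth_first(const Catalog& c) {
    std::vector<std::string> order;
    for (const Book& b : c.books) {
        order.push_back(b.title);
        for (const Author& a : b.authors) order.push_back(a.name);
    }
    return order;
}

// Builds a two-book catalog similar to the one drawn on the slides.
Catalog sample_catalog() {
    Catalog c;
    c.books.push_back(Book{"Book1", {Author{"A1"}, Author{"A2"}}});
    c.books.push_back(Book{"Book2", {Author{"A3"}, Author{"A4"}}});
    return c;
}
```

With the sample catalog, `breadth_first` yields Book1, Book2, A1, A2, A3, A4, while `depth_first` yields Book1, A1, A2, Book2, A3, A4.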
 Composing named queries
      Queries can be named, composed, and passed around as executable expressions
      Example:
       For each book
            print(country of the author)
            print(price of the book)

[Figure: UML class diagram of the catalog schema (Catalog, Book, Author, Name, Price, Country) and MyVisitor with one visit method per element.]

Using XPath
???????????????????????????????

Using LEESA (6 lines)
Catalog croot = load_catalog("catalog.xml");
MyVisitor visitor;
BOOST_AUTO(v_country, Author() >> Country() >> visitor);
BOOST_AUTO(v_price,   Price() >> visitor);
BOOST_AUTO(members, MembersOf(Book(), v_country, v_price));
evaluate(croot, Catalog() >>= Book() >> members);

                                                                                                       25 / 54
 Using visitors (recursively)
       Hierarchical Visitor design pattern
       Visit and Leave methods for all elements
       Depth awareness
       Example: Visit everything!!
       Stateful, statically typed traversal
       Descendant axis = recursive
       AroundFullTD = AroundFullTopDown

Using XPath
???????????????????????????????

Using LEESA (3 lines!!)
Catalog croot = load_catalog("catalog.xml");
MyHierarchicalVisitor v;
AroundFullTD(Catalog(), VisitStrategy(v), LeaveStrategy(v))(croot);



                                                                       26 / 54
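The hierarchical visitor idea (a visit call on the way down, a leave call on the way up, which gives depth awareness) can be sketched without LEESA. The example below uses a hypothetical homogeneous `Node` tree and a hypothetical `around_full_td` function; LEESA's `AroundFullTD` works on typed, heterogeneous trees, so this is only an illustration of the traversal shape.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical homogeneous tree used only for illustration.
struct Node {
    std::string label;
    std::vector<Node> kids;
};

// A stateful hierarchical visitor: visit() runs before the children,
// leave() runs after them, so `depth` always reflects the nesting level.
struct IndentingVisitor {
    std::size_t depth;
    std::vector<std::string> lines;
    IndentingVisitor() : depth(0) {}
    void visit(const Node& n) {
        lines.push_back(std::string(2 * depth, ' ') + n.label);
        ++depth;
    }
    void leave(const Node&) { --depth; }
};

// Full top-down traversal with a visit/leave pair around each subtree.
template <class Visitor>
void around_full_td(const Node& n, Visitor& v) {
    v.visit(n);
    for (const Node& k : n.kids) around_full_td(k, v);
    v.leave(n);
}
```

Because the visitor is passed by reference, its state (here the indentation depth and the collected lines) survives the whole traversal.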
 LEESA
  1.   Is not an XML parsing library
  2.   Does not validate XML files
       (An XML data binding tool can do both.)
  3.   Does not replace/compete with XPath
  4.   Does not resolve X/O impedance mismatch
         More reading: “Revealing X/O impedance mismatch”, Dr. R Lämmel


 LEESA
  1.   Is a query and traversal library for C++
  2.   Validates XPath-like queries at compile-time (schema conformance)
  3.   Is motivated by XPath
  4.   Goes beyond XPath
  5.   Simplifies typed XML programming
  6.   Is an embedded DSEL (Domain-specific embedded language)
  7.   Is applicable beyond xml (E.g., Google Protocol Buffers, model
       traversal, hand coded class hierarchies, etc.)
                                                                       27 / 54
 The Process

[Figure: LEESA tool chain. An XML schema is the input to the extended schema compiler (code generator), which generates C++ code: static meta-information, a type-driven data access layer, and an object-oriented data access layer. Programmers write LEESA expressions (axes-oriented traversal expressions and recursive traversals based on strategic programming). Arrows labelled "Checked against" and "Uses" connect the expressions to the generated code. The generated code and the expressions are inputs to the C++ compiler, which generates the executable.]

                                                                                                       29 / 54
[Figure: extended schema compiler pipeline. An XML schema is fed to a schema compiler, which produces the object-oriented data access layer (C++ .h). Doxygen turns the headers into XML files, an XSLT step combines them into one XML file, and the meta-data generator consumes it. The outputs are static meta-information, the type-driven data access layer (C++ .h, .cpp), and visitor declarations. The Doxygen, XSLT, and meta-data generator steps are grouped under LEESA's gen-meta.py script.]

 Extended schema compiler = 4 step process
    The XML schema language (XSD) specification is huge and complex
    Don't reinvent the wheel: XML data binding tools already process it
    Naming conventions of XML data binding tools vary
    Applicability beyond XML data binding
        E.g. Google Protocol Buffers (GPB), hand-written class hierarchies
    The meta-data generator script inserts visitor declarations in the C++
     classes                                                               30 / 54
   To fix: the different interface of each class
     Generic API “children” wrappers to navigate aggregation
     Generated by the Python script
     More amenable to composition

std::vector<Book> children (Catalog &c, Book const *) {
  return c.get_Book();
}
std::vector<Catalog> children (Catalog &c, Catalog const *) {
  return c.get_Catalog();
}
std::vector<Author> children (Book &b, Author const *) {
  return b.get_Author();
}
Price children (Book &b, Price const *) {
  return b.get_Price();
}
Name children (Book &b, Name const *) {
  return b.get_Name();
}
Country children (Author &a, Country const *) {
  return a.get_Country();
}
Name children (Author &a, Name const *) {
  return a.get_Name();
}
                                                                31 / 54
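The point of the "children" wrappers is that generic code can navigate from a parent to a child by naming only the child type. The sketch below shows the idea on a hypothetical hand-written object model (the structs and the helpers `all_children` and `country_codes` are assumptions, not generated LEESA code). As a simplification, single-valued children are also returned in a vector.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for classes an XML data binding tool would generate.
struct Country { std::string code; };
struct Author {
    Country country;
    const Country& get_Country() const { return country; }
};
struct Book {
    std::vector<Author> authors;
    const std::vector<Author>& get_Author() const { return authors; }
};
struct Catalog {
    std::vector<Book> books;
    const std::vector<Book>& get_Book() const { return books; }
};

// One uniform name; the unused pointer parameter selects the overload.
std::vector<Book>    children(const Catalog& c, const Book*)   { return c.get_Book(); }
std::vector<Author>  children(const Book& b,    const Author*) { return b.get_Author(); }
std::vector<Country> children(const Author& a,  const Country*) {
    return std::vector<Country>(1, a.get_Country());
}

// Generic navigation by type alone: no getter names appear here.
template <class Child, class Parent>
std::vector<Child> all_children(const std::vector<Parent>& parents) {
    std::vector<Child> out;
    for (const Parent& p : parents) {
        std::vector<Child> kids = children(p, static_cast<const Child*>(nullptr));
        out.insert(out.end(), kids.begin(), kids.end());
    }
    return out;
}

// Hypothetical helper: country codes of all authors of all books.
std::vector<std::string> country_codes(const Catalog& c) {
    std::vector<Catalog> roots(1, c);
    std::vector<Book> books = all_children<Book>(roots);
    std::vector<Author> authors = all_children<Author>(books);
    std::vector<Country> countries = all_children<Country>(authors);
    std::vector<std::string> codes;
    for (const Country& k : countries) codes.push_back(k.code);
    return codes;
}
```

The template `all_children` never mentions `get_Book` or `get_Author`, which is what makes this style of access amenable to composition.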
 Ambiguity!
     Simple elements and attributes are mapped to built-in types
     “children” function overloads become ambiguous
<xs:complexType name="Author">
  <xs:sequence>
    <xs:element name="first_name" type="xs:string" />
    <xs:element name="last_name" type="xs:string" />
  </xs:sequence>
</xs:complexType>

Mapping (gen-meta.py):

std::string children (Author &a, std::string const *) {
  return a.get_first_name();
}
// Same signature as the overload above, hence the ambiguity
std::string children (Author &a, std::string const *) {
  return a.get_last_name();
}




                                                                                          32 / 54
 Solution 1: Automatic schema transformation
     Force data binding tools to generate unique C++ types
     gen-meta.py can transform the input xsd while preserving semantics
<xs:complexType name="Author">
  <xs:sequence>
    <xs:element name="first_name" type="xs:string" />
    <xs:element name="last_name" type="xs:string" />
  </xs:sequence>
</xs:complexType>

Transformation (gen-meta.py):

<xs:complexType name="Author">
 <xs:sequence>
  <xs:element name="first_name">
   <xs:simpleType>
     <xs:restriction base="xs:string" />
   </xs:simpleType>
  </xs:element>
  <xs:element name="last_name">
   <xs:simpleType>
     <xs:restriction base="xs:string" />
   </xs:simpleType>
  </xs:element>
 </xs:sequence>
</xs:complexType>
                                                                        33 / 54
 Solution 1 limitations: Too many types! Longer compilation times.
 Solution 2: Generate placeholder types
    Create unique type aliases using a template and integer literals
    Not implemented!
             <xs:complexType name="Author">
               <xs:sequence>
                 <xs:element name="first_name" type="xs:string" />
                 <xs:element name="last_name" type="xs:string" />
               </xs:sequence>
             </xs:complexType>

             Code generation (gen-meta.py):
             namespace LEESA {
               template <class T, unsigned int I>
               struct unique_type
               {
                  typedef T nested;
               };
             }
             namespace Library {
               typedef LEESA::unique_type<std::string, 1> first_name;
               typedef LEESA::unique_type<std::string, 2> last_name;
             }
                                                                        34 / 54
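A small sketch may help show why the placeholder types remove the ambiguity. The slide states that Solution 2 is not implemented, so everything beyond the `unique_type` template itself is an assumption: the stored `value` member, the hypothetical `Author` struct, and the two `children` overloads are illustrative only, and the namespaces are omitted for brevity.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Placeholder template as on the slide: the integer makes each alias
// a distinct type even though both wrap std::string.
template <class T, unsigned int I>
struct unique_type {
    typedef T nested;
    T value;  // assumption: the wrapper stores the value; the slide shows only the typedef
};

typedef unique_type<std::string, 1> first_name;
typedef unique_type<std::string, 2> last_name;

// Hypothetical Author class with two string-valued children.
struct Author {
    std::string first;
    std::string last;
};

// The overloads no longer collide because the tag types differ.
first_name children(const Author& a, const first_name*) { return first_name{a.first}; }
last_name  children(const Author& a, const last_name*)  { return last_name{a.last}; }

static_assert(!std::is_same<first_name, last_name>::value,
              "the aliases must be distinct types");
static_assert(std::is_same<first_name::nested, last_name::nested>::value,
              "both aliases wrap std::string");
```

A caller selects the child by type, for example `children(author, static_cast<const last_name*>(nullptr))`.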
 A key idea in LEESA
      Externalize structural meta-information using Boost.MPL
      LEESA’s meta-programs traverse the meta-information at compile-time

[Figure: UML class diagram of the catalog schema. Catalog contains Book and Catalog; Book contains Name, Price, and Author; Author contains Name and Country.]

template <class Kind>
struct SchemaTraits
{
   typedef mpl::vector<> Children; // Empty sequence
};
template <>
struct SchemaTraits <Catalog>
{
   typedef mpl::vector<Book, Catalog> Children;
};
template <>
struct SchemaTraits <Book>
{
   typedef mpl::vector<Name, Price, Author> Children;
};
template <>
struct SchemaTraits <Author>
{
   typedef mpl::vector<Name, Country> Children;
};

                                                                                                           35 / 54
 A key idea in LEESA
      Externalize structural meta-information using Boost.MPL
      Descendant meta-information is a transitive closure of Children
[Figure: UML class diagram of the catalog schema. Catalog contains Book and Catalog; Book contains Name, Price, and Author; Author contains Name and Country.]

template <class Kind> struct SchemaTraits {
   typedef mpl::vector<> Children; // Empty sequence
};
template <> struct SchemaTraits <Catalog> {
   typedef mpl::vector<Book, Catalog> Children;
};
template <> struct SchemaTraits <Book> {
   typedef mpl::vector<Name, Price, Author> Children;
};
template <> struct SchemaTraits <Author> {
   typedef mpl::vector<Name, Country> Children;
};

typedef boost::mpl::true_ True;
typedef boost::mpl::false_ False;

template<class A, class D> struct IsDescendant : False {};
template<> struct IsDescendant<Catalog, Catalog> : True {};
template<> struct IsDescendant<Catalog, Book>    : True {};
template<> struct IsDescendant<Catalog, Name>    : True {};
template<> struct IsDescendant<Catalog, Price>   : True {};
template<> struct IsDescendant<Catalog, Author>  : True {};
template<> struct IsDescendant<Catalog, Country> : True {};
template<> struct IsDescendant<Book, Name>       : True {};
template<> struct IsDescendant<Book, Price>      : True {};
template<> struct IsDescendant<Book, Author>     : True {};
template<> struct IsDescendant<Book, Country>    : True {};
template<> struct IsDescendant<Author, Name>     : True {};
template<> struct IsDescendant<Author, Country>  : True {};
                                                                                                              36 / 54
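The slide lists the descendant relation as explicit specializations. Since it is the transitive closure of Children, it can also be computed by the compiler. The sketch below is one way to do that in standard C++11 without Boost; it is not LEESA's implementation. `TypeList` stands in for `boost::mpl::vector`, and `Contains`, `AnyReaches`, and `Reaches` are hypothetical helper names. The `Visited` list keeps self-recursive types such as Catalog from causing infinite template recursion.

```cpp
#include <cassert>
#include <type_traits>

struct Catalog; struct Book; struct Author;
struct Name; struct Price; struct Country;

template <class... Ts> struct TypeList {};

// Children meta-information, mirroring SchemaTraits on the slide.
template <class Kind> struct SchemaTraits { typedef TypeList<> Children; };
template <> struct SchemaTraits<Catalog> { typedef TypeList<Book, Catalog> Children; };
template <> struct SchemaTraits<Book>    { typedef TypeList<Name, Price, Author> Children; };
template <> struct SchemaTraits<Author>  { typedef TypeList<Name, Country> Children; };

// Contains<T, List>: is T an element of List?
template <class T, class List> struct Contains;
template <class T> struct Contains<T, TypeList<> > : std::false_type {};
template <class T, class H, class... R>
struct Contains<T, TypeList<H, R...> >
    : std::conditional<std::is_same<T, H>::value, std::true_type,
                       Contains<T, TypeList<R...> > >::type {};

template <class From, class Target, class Visited> struct Reaches;

// AnyReaches: does any type in Kids equal Target or lead to it?
template <class Target, class Visited, class Kids> struct AnyReaches;
template <class Target, class Visited>
struct AnyReaches<Target, Visited, TypeList<> > : std::false_type {};
template <class Target, class Visited, class K, class... Rest>
struct AnyReaches<Target, Visited, TypeList<K, Rest...> >
    : std::conditional<
          std::is_same<K, Target>::value || Reaches<K, Target, Visited>::value,
          std::true_type,
          AnyReaches<Target, Visited, TypeList<Rest...> > >::type {};

// Reaches: search the children of From unless From was already visited.
template <class From, class Target, class... V>
struct Reaches<From, Target, TypeList<V...> >
    : std::conditional<
          Contains<From, TypeList<V...> >::value,
          std::false_type,
          AnyReaches<Target, TypeList<V..., From>,
                     typename SchemaTraits<From>::Children> >::type {};

template <class Ancestor, class Descendant>
struct IsDescendant : Reaches<Ancestor, Descendant, TypeList<> > {};

static_assert(IsDescendant<Catalog, Country>::value, "Country occurs below Catalog");
static_assert(IsDescendant<Catalog, Catalog>::value, "Catalog is self-recursive");
static_assert(!IsDescendant<Author, Price>::value, "Authors do not contain Prices");
```

The trade-off is compile time: explicit specializations are cheap lookups, while the computed closure is re-derived through template instantiation for every queried pair.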
std::vector<Country> countries = DescendantsOf(Catalog(), Country())(croot);

 Algorithm (conceptual)
1. IsDescendant<Catalog, Country>::value
2. Find all children types of Catalog
    SchemaTraits<Catalog>::Children =
    boost::mpl::vector<Book, Catalog>
3. Iterate over Boost.MPL vector
4. IsDescendant<Book, Country>::value
5. Use type-driven data access on each Catalog
    std::vector<Book> = children(Catalog&, Book*)
    For Catalogs repeat step (1)
6. Find all children types of Book
    SchemaTraits<Book>::Children =
    boost::mpl::vector<Name, Price, Author>
7. Iterate over Boost.MPL vector
8. IsDescendant<Name, Country>::value
9. IsDescendant<Price, Country>::value
10. IsDescendant<Author, Country>::value
11. Use type-driven data access on each Book
    std::vector<Author> = children(Book&, Author*)
12. Find all children types of Author
    SchemaTraits<Author>::Children =
    boost::mpl::vector<Name, Country>
13. Repeat until Country objects are found

[Figure: type tree searched by the algorithm. Catalog has children Book and Catalog; Book has Name, Author, and Price; Author has Country and Name.]
                                                                                37 / 54
 Strategic Programming Paradigm
   A systematic way of creating recursive tree traversal
   Developed in 1998 as a term rewriting language: Stratego
 Why LEESA uses strategic programming
   Generic
       LEESA can be designed without knowing the types in an XML tree
   Recursive
       LEESA can handle mutually and/or self-recursive types
   Reusable
       LEESA can be reused as a library for any xsd
   Composable
       LEESA can be extended by its users using policy-based templates
 Basic combinators
   Identity, Fail, Sequence, Choice, All, and One



                                                                          38 / 54
 Pre-order traversal pseudo-code (fullTopDown)

fullTD(node)
{
  visit(node);
  forall children c of node
       fullTD(c);
}

fullTD(node)
{
  visit(node);
  All(node, fullTD);
}

All(node, strategy)
{
  forall children c of node
    strategy(c);
}

 Recursive traversal (1 out of many)

fullTD(node)
{
  seq(node, visit, All(fullTD));
}

 Basic combinators (2 out of 6)

seq(node,strategy1,strategy2)
{
  strategy1(node);
  strategy2(node);
}

All(node, strategy)
{
  forall children c of node
    strategy(c);
}
                                                                              39 / 54
template <class Strategy1,
          class Strategy2>
class Seq
{
   template <class Data>
   void operator()(Data d)
   {
     Strategy1(d);
     Strategy2(d);
   }
};

template <class Strategy>
class All
{
   template <class Data>
   void operator()(Data d)
   {
     // Boost.MPL meta-information
     foreach T in SchemaTraits<Data>::Children
     {
        // Type-driven data access
        std::vector<T> t = children(d, (T *)0);
        Strategy(t);
     }
   }
};

        Sequence + All = FullTD

 template <class Strategy>
 class FullTD
 {
    template <class Data>
    void operator()(Data d)
    {
      Seq<Strategy,All<FullTD> >(d);
    }
 };

Note: Objects and constructors omitted for brevity                                            40 / 54
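To make the combinators concrete, here is a compilable sketch of Seq, All, and FullTD with the objects and constructors filled in. It assumes a hypothetical homogeneous `Node` tree instead of a typed XML object model, so the `foreach T in Children` loop collapses to a loop over `kids`. `Ref` and `Recorder` are hypothetical helpers; `Ref` keeps the visitor by pointer so its state accumulates in one place.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical homogeneous tree; LEESA works on heterogeneous, typed trees.
struct Node {
    std::string label;
    std::vector<Node> kids;
};

// Seq: apply two strategies to the same node, in order.
template <class S1, class S2>
struct Seq {
    S1 first;
    S2 second;
    Seq(S1 a, S2 b) : first(a), second(b) {}
    void operator()(const Node& n) { first(n); second(n); }
};

// All: apply a strategy to every immediate child.
template <class S>
struct All {
    S strategy;
    explicit All(S s) : strategy(s) {}
    void operator()(const Node& n) {
        for (const Node& k : n.kids) strategy(k);
    }
};

// Ref: non-owning handle so that copies of a strategy share one visitor.
template <class V>
struct Ref {
    V* target;
    explicit Ref(V& v) : target(&v) {}
    void operator()(const Node& n) { (*target)(n); }
};

// FullTD = Seq(visit, All(FullTD)): pre-order traversal.
template <class Visit>
struct FullTD {
    Ref<Visit> visit;
    explicit FullTD(Visit& v) : visit(v) {}
    void operator()(const Node& n) {
        Seq<Ref<Visit>, All<FullTD> > step(visit, All<FullTD>(*this));
        step(n);
    }
};

// Hypothetical visitor that records labels in visiting order.
struct Recorder {
    std::vector<std::string> seen;
    void operator()(const Node& n) { seen.push_back(n.label); }
};

std::vector<std::string> preorder(const Node& root) {
    Recorder r;
    FullTD<Recorder> traverse(r);
    traverse(root);
    return r.seen;
}
```

Other traversals (for example bottom-up) can be obtained by swapping the order of the two strategies inside `Seq`, which is the reuse that the combinator style is meant to provide.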
 BOOST_AUTO(prices, DescendantsOf(Catalog(), Price()));

  LEESA uses FullTopDown<Accumulator<Price>>
  But schema-unaware recursion in every sub-structure is inefficient
  We know that Authors do not contain Prices

  FullTD may be inefficient; LEESA's schema-aware traversal is optimal

  IsDescendant<Catalog,Price> = True
  IsDescendant<Author,Price> = False

  Bypass unnecessary sub-structures (Author) using meta-programming

[Figure: UML class diagram of the catalog schema. Catalog contains Book and Catalog; Book contains Price, Name, and Author; Author contains Name and Country.]
                                                                                                  41 / 54
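The pruning idea can be sketched with tag dispatch: a compile-time predicate decides whether a sub-structure is entered at all. The code below is an illustration under stated assumptions, not LEESA's implementation: the object model is hand-written, `MayContainPrice` is a hypothetical stand-in for `IsDescendant<T, Price>`, and `nodes_entered` exists only to make the pruning observable.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical hand-written object model.
struct Price   { double amount; };
struct Country { int code; };
struct Author  { Country country; };
struct Book    { Price price; std::vector<Author> authors; };
struct Catalog { std::vector<Book> books; };

// Stand-in for IsDescendant<T, Price>: only Catalog and Book can lead to a Price.
template <class T> struct MayContainPrice : std::false_type {};
template <> struct MayContainPrice<Catalog> : std::true_type {};
template <> struct MayContainPrice<Book>    : std::true_type {};

class PriceCollector {
public:
    std::vector<double> prices;
    int nodes_entered;  // counts sub-structures that were actually traversed
    PriceCollector() : nodes_entered(0) {}

    // Dispatch on the compile-time predicate.
    template <class T> void descend(const T& t) {
        descend_if(t, typename MayContainPrice<T>::type());
    }

private:
    // Pruned: the body is empty, so no traversal code is generated for T.
    template <class T> void descend_if(const T&, std::false_type) {}

    template <class T> void descend_if(const T& t, std::true_type) {
        ++nodes_entered;
        walk(t);
    }

    void walk(const Catalog& c) {
        for (const Book& b : c.books) descend(b);
    }
    void walk(const Book& b) {
        prices.push_back(b.price.amount);
        for (const Author& a : b.authors) descend(a);  // resolves to the pruned overload
    }
};

PriceCollector collect_prices(const Catalog& c) {
    PriceCollector pc;
    pc.descend(c);
    return pc;
}
```

Note that no `walk(const Author&)` exists: because the `std::true_type` overload is never instantiated for `Author`, the compiler never asks for it.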
 LEESA has compile-time schema conformance checking
     LEESA queries compile only if they agree with the schema
     Uses externalized schema and meta-programming
     Error message using BOOST_MPL_ASSERT
     Tries to reduce long and incomprehensible error messages
     Shows assertion failures in terms of concepts
         ParentChildConcept, DescendantKindConcept, etc.
         Originally developed for C++0x concepts

   Examples

 1. BOOST_AUTO(prices, DescendantsOf(Author(), Price()));
        DescendantKindConcept failure

 2. BOOST_AUTO(books, Catalog() >> Book() >> Book());
        ParentChildConcept failure

 3. BOOST_AUTO(countries, LevelDescendantsOf(Catalog(),_,Country()));
        LevelDescendantKindConcept failure

                                                                           42 / 54
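A parent-child conformance check can be sketched in a few lines. The example below swaps in the C++11 `static_assert` for `BOOST_MPL_ASSERT` and a variadic `TypeList` for `boost::mpl::vector`; `Path`, `from`, and `IsChild` are hypothetical names, and the schema is reduced to four types. It shows only the checking mechanism, not LEESA's expression classes.

```cpp
#include <cassert>
#include <type_traits>

struct Catalog {}; struct Book {}; struct Author {}; struct Price {};

template <class... Ts> struct TypeList {};

// Externalized schema: children of each type.
template <class Kind> struct SchemaTraits { typedef TypeList<> Children; };
template <> struct SchemaTraits<Catalog> { typedef TypeList<Book, Catalog> Children; };
template <> struct SchemaTraits<Book>    { typedef TypeList<Price, Author> Children; };

// Contains<T, List>: is T an element of List?
template <class T, class List> struct Contains;
template <class T> struct Contains<T, TypeList<> > : std::false_type {};
template <class T, class H, class... R>
struct Contains<T, TypeList<H, R...> >
    : std::conditional<std::is_same<T, H>::value, std::true_type,
                       Contains<T, TypeList<R...> > >::type {};

template <class Parent, class Child>
struct IsChild : Contains<Child, typename SchemaTraits<Parent>::Children> {};

// Path<Last>: a query whose final step yields elements of type Last.
template <class Last> struct Path {};

template <class Root> Path<Root> from(Root) { return Path<Root>(); }

// Each ">>" step compiles only if the schema allows Last -> Child.
template <class Last, class Child>
Path<Child> operator>>(Path<Last>, Child) {
    static_assert(IsChild<Last, Child>::value,
                  "ParentChildConcept failed: not a parent/child pair in the schema");
    return Path<Child>();
}

// Agrees with the schema, so it compiles.
inline Path<Author> valid_query() { return from(Catalog()) >> Book() >> Author(); }

// Would not compile, because Book is not a child of Book:
//   from(Catalog()) >> Book() >> Book();
```

Putting the concept name in the assertion message is what keeps the diagnostic readable: the compiler reports the failed concept before any deeper template noise.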
 Country is at least 2 “steps” away from a Catalog
LevelDescendantsOf(Catalog(),_,Country());
1>------ Build started: Project: library, Configuration: Release Win32 ------
1> driver.cxx
1> using native typeof
1>C:mySVNLEESAincludeLEESA/SP_Accumulation.cpp(112): error C2664: 'boost::mpl::assertion_failed' : cannot convert
parameter 1 from 'boost::mpl::failed
************LEESA::LevelDescendantKindConcept<ParentKind,DescendantKind,SkipCount,Custom>::* ***********' to
'boost::mpl::assert<false>::type'
1>          with
1>          [
1>              ParentKind=library::Catalog,
1>              DescendantKind=library::Country,
1>              SkipCount=1,
1>              Custom=LEESA::Default
1>          ]
1>          No constructor could take the source type, or constructor overload resolution was ambiguous
1>          driver.cxx(155) : see reference to class template instantiation
'LEESA::LevelDescendantsOp<Ancestor,Descendant,SkipCount,Custom>' being compiled
1>          with
1>          [
1>              Ancestor=LEESA::Carrier<library::Catalog>,
1>              Descendant=LEESA::Carrier<library::Country>,
1>              SkipCount=1,
1>              Custom=LEESA::Default
1>          ]
1>C:mySVNLEESAincludeLEESA/SP_Accumulation.cpp(112): error C2866:
'LEESA::LevelDescendantsOp<Ancestor,Descendant,SkipCount,Custom>::mpl_assertion_in_line_130' : a const static data member
of a managed type must be initialized at the point of declaration
1>          with
1>          [
1>              Ancestor=LEESA::Carrier<library::Catalog>,
1>              Descendant=LEESA::Carrier<library::Country>,
1>              SkipCount=1,
1>              Custom=LEESA::Default
1>          ]
1> Generating Code...
========== Build: 0 succeeded, 1 failed, 0 up-to-date, 0 skipped ==========
                                                                                                                      43 / 54
 (Nearly) all LEESA queries are expression templates
    Hand rolled. Not using Boost.Proto
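                      Catalog() >> Book() >> Author() >> Name()

[Figure: expression tree built by the query, numbered in order of execution.
  1. ChainExpr(Catalog, GetChildren<Catalog, Book>)
  2. ChainExpr(result of 1, GetChildren<Book, Author>)
  3. ChainExpr(result of 2, GetChildren<Author, Name>)
 At each step, the LResultType of the left operand supplies the parent type of the next GetChildren.]
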

  template <class L, class H>
  ChainExpr<L, GetChildren<typename ExpressionTraits<L>::result_type, H> >
  operator >> (L l, H h)
  {
    typedef typename ExpressionTraits<L>::result_type LResultType;
    typedef GetChildren<LResultType, H> GC;
    return ChainExpr<L, GC>(l, h);
  }
                                                                                      44 / 54
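The following sketch shows how such a chain can be built and then executed as a unary function object. It is a simplified illustration, not LEESA's code: `Root` is a hypothetical starting expression that replaces `ExpressionTraits`, every step returns a `std::vector`, the object model is hand-written, and `auto` (C++11) is used where the slides use `BOOST_AUTO`.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical hand-written object model with type-driven access.
struct Author  { std::string name; };
struct Book    { std::vector<Author> authors; };
struct Catalog { std::vector<Book> books; };

std::vector<Book>   children(const Catalog& c, const Book*)   { return c.books; }
std::vector<Author> children(const Book& b,    const Author*) { return b.authors; }

// Root<T>: starting expression; wraps the input object in a vector.
template <class T>
struct Root {
    typedef T result_type;
    std::vector<T> operator()(const T& t) const { return std::vector<T>(1, t); }
};

// GetChildren<P, C>: maps a vector of parents to the vector of their C children.
template <class P, class C>
struct GetChildren {
    typedef C result_type;
    std::vector<C> operator()(const std::vector<P>& parents) const {
        std::vector<C> out;
        for (const P& p : parents) {
            std::vector<C> kids = children(p, static_cast<const C*>(nullptr));
            out.insert(out.end(), kids.begin(), kids.end());
        }
        return out;
    }
};

// ChainExpr<L, R>: run L, then feed its result to R.
template <class L, class R>
struct ChainExpr {
    typedef typename R::result_type result_type;
    L left;
    R right;
    ChainExpr(L l, R r) : left(l), right(r) {}
    template <class In>
    std::vector<result_type> operator()(const In& in) const { return right(left(in)); }
};

// Building the expression does no traversal; it only composes types.
template <class L, class H>
ChainExpr<L, GetChildren<typename L::result_type, H> >
operator>>(L l, H) {
    typedef GetChildren<typename L::result_type, H> GC;
    return ChainExpr<L, GC>(l, GC());
}

// Hypothetical helper: run Catalog >> Book >> Author and collect the names.
std::vector<std::string> author_names(const Catalog& c) {
    auto query = Root<Catalog>() >> Book() >> Author();
    std::vector<Author> authors = query(c);
    std::vector<std::string> names;
    for (const Author& a : authors) names.push_back(a.name);
    return names;
}
```

The query object can be stored, passed around, and applied to several catalogs, because all of the structure lives in its type and evaluation happens only in `operator()`.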
 (Nearly) all LEESA queries are expression templates
    Hand rolled. Not using Boost.Proto
    Every LEESA expression becomes a unary function object
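    LEESA query: systematically composed unary function objects

              Catalog() >>= Book() >> Author() >> Name()

[Figure: expression tree built by the query.
  1. ChainExpr(Catalog, DepthFirstGetChildren<Catalog, Book>)
  2. The DepthFirstGetChildren<Catalog, Book> node holds the rest of the query as a nested expression:
     2a. ChainExpr(Book, GetChildren<Book, Author>)
     2b. ChainExpr(result of 2a, GetChildren<Author, Name>)]
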
                                                                                      45 / 54
 Reduction in boilerplate traversal code
    Results from the 2009 paper in the Working Conference on
     Domain-Specific Languages, Oxford, UK




                                            87% reduction in traversal code
                                                                     47 / 54
 CodeSynthesis xsd data binding tool on the catalog xsd
 Abstraction penalty from construction, copying, and destruction of
  internal containers (std::vector<T> and LEESA::Carrier<T>)
 GNU Profiler: Highest time spent in std::vector<T>::insert and
  iterator dereference functions
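
  The copying cost named above can be made visible with a small counting
  experiment. The sketch below is an assumption-laden illustration, not
  LEESA code: Counted, Parent, children(), and the helper functions are
  hypothetical. It accumulates children that an accessor returns by value,
  once with a copying insert and once with a moving insert.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

namespace copy_sketch {

// Element type that counts how often it is copied; moves are not counted.
struct Counted {
  static int copies;
  Counted() {}
  Counted(const Counted&) { ++copies; }
  Counted(Counted&&) noexcept {}
  Counted& operator=(const Counted&) { ++copies; return *this; }
  Counted& operator=(Counted&&) noexcept { return *this; }
};
int Counted::copies = 0;

struct Parent { std::vector<Counted> kids; };

// Accessor that returns by value, like the generated get_*() functions.
inline std::vector<Counted> children(const Parent& p) { return p.kids; }

// Copies each element when the accessor returns, then again on insert.
inline int copies_with_copy_insert(const std::vector<Parent>& parents) {
  Counted::copies = 0;
  std::vector<Counted> out;
  for (const Parent& p : parents) {
    std::vector<Counted> kids = children(p);
    out.insert(out.end(), kids.begin(), kids.end());
  }
  return Counted::copies;
}

// Same accumulation, but the temporary's elements are moved into the result.
inline int copies_with_move_insert(const std::vector<Parent>& parents) {
  Counted::copies = 0;
  std::vector<Counted> out;
  for (const Parent& p : parents) {
    std::vector<Counted> kids = children(p);
    out.insert(out.end(), std::make_move_iterator(kids.begin()),
               std::make_move_iterator(kids.end()));
  }
  return Counted::copies;
}

// Builds `parents` parents with `kids_each` default-constructed children each.
inline std::vector<Parent> make_parents(std::size_t parents, std::size_t kids_each) {
  std::vector<Parent> v(parents);
  for (Parent& p : v) p.kids.resize(kids_each);
  return v;
}

}  // namespace copy_sketch
```

  With move-aware element types the second variant pays only for the copy
  made by the by-value accessor, which is the kind of saving the C++0x
  slide later attributes to rvalue references.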




                          (data binding)

    33 seconds for parsing, validating, and object model construction
                                                                                  48 / 54
   Compilation time affects programmer productivity
   Experiment
      An XML schema containing 300 types (4 recursive)
      gcc 4.5 (with and without variadic templates)



                            (data binding)




                                                          49 / 54
   Experiment: Total time to build an executable from an xsd on 4 compilers
      XML schema containing 300 types (4 recursive)
      5 LEESA expressions (all using descendant axis)
      Tested on Intel Core 2 Duo 2.67 GHz, 4 GB laptop




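   [Chart values by column position, top to bottom. Compiler names,
    segment labels, and units were lost in extraction.]
      Column 1: 79, 54, 60, 95
      Column 2: 44, 126, 95
      Column 3: 18, 112, 95
      Column 4: 15, 101, 95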




                                                                         50 / 54
 Readability improvements
     Lambdas!
     LEESA actions (e.g., Select, Sort) can use C++0x lambdas
     static_assert for improved error reporting
     auto for naming LEESA expressions
 Performance improvements (run-time)
     Rvalue references and move semantics
     Optimize away internal copies of large containers
 Performance improvements (Compile-time)
     Variadic templates → Faster schema conformance checking
     No need to use BOOST_MPL_LIMIT_VECTOR_SIZE and
      Boost.Preprocessor tricks
 Simplifying LEESA’s implementation
     Trailing return-type syntax and decltype
     Right angle bracket syntax
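
  Several of the features listed above can be combined in a few lines. The
  sketch below is a hedged illustration in C++11, not LEESA source: TypeList
  and Contains are hypothetical replacements for the Boost.MPL vector, and
  IsChild, ChildStep, AuthorRecord, select_if(), and authors_from() are
  names invented here. It shows a variadic schema description, a
  static_assert with a readable message, and a lambda used as a filter
  together with auto.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <vector>

namespace cxx0x_sketch {

// Empty tags standing in for the generated element classes.
struct Catalog {}; struct Book {}; struct Author {};
struct Name {};    struct Price {}; struct Country {};

// Variadic type list: no fixed size limit and no preprocessor repetition.
template <class... Ts> struct TypeList {};

// Contains<T, List>::value is true when T occurs in List.
template <class T, class List> struct Contains;
template <class T>
struct Contains<T, TypeList<> > : std::false_type {};
template <class T, class Head, class... Tail>
struct Contains<T, TypeList<Head, Tail...> >
    : std::conditional<std::is_same<T, Head>::value,
                       std::true_type,
                       Contains<T, TypeList<Tail...> > >::type {};

// Schema meta-information, mirroring the SchemaTraits slides.
template <class Kind> struct SchemaTraits { typedef TypeList<> Children; };
template <> struct SchemaTraits<Catalog> { typedef TypeList<Book, Catalog> Children; };
template <> struct SchemaTraits<Book>    { typedef TypeList<Name, Price, Author> Children; };
template <> struct SchemaTraits<Author>  { typedef TypeList<Name, Country> Children; };

template <class Parent, class Child>
struct IsChild : Contains<Child, typename SchemaTraits<Parent>::Children> {};

// Instantiating ChildStep for an invalid pair stops compilation with the message.
template <class Parent, class Child>
struct ChildStep {
  static_assert(IsChild<Parent, Child>::value,
                "ParentChildConcept failure: Child is not a child of Parent");
};

// Hypothetical flat record used only for the lambda example.
struct AuthorRecord { std::string name; std::string country; };

// Keeps the elements for which pred returns true.
template <class T, class Pred>
std::vector<T> select_if(const std::vector<T>& in, Pred pred) {
  std::vector<T> out;
  for (const T& x : in)
    if (pred(x)) out.push_back(x);
  return out;
}

// auto names the lambda; the lambda replaces a hand-written functor class.
inline std::vector<AuthorRecord> authors_from(const std::vector<AuthorRecord>& all,
                                              const std::string& country) {
  auto matches = [&country](const AuthorRecord& a) { return a.country == country; };
  return select_if(all, matches);
}

}  // namespace cxx0x_sketch
```

  Declaring ChildStep<Catalog, Book> compiles, while declaring
  ChildStep<Book, Book> would be rejected with the assertion text, in the
  same spirit as the Catalog() >> Book() >> Book() example.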
                                                                 51 / 54
 Become a part of the Boost libraries!?
 Extend LEESA to support
      Google Protocol Buffers (GPB)
      Apache Thrift
      Or any “schema-first” data binding in C++
 Better support from data binding tools?
 Parallelization on multiple cores
      Parallelize query execution on multiple cores
       behind LEESA’s high-level declarative
       programming API
 Co-routine style programming model
      LEESA expressions return containers
      Expression to container → expensive!
      Expression to iterator → cheap!
      Compute result only when needed (lazy)
 XML literal construction
      Checked against schema at compile-time
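
  The container-versus-iterator point in the co-routine bullets can be
  sketched as follows. This is a hypothetical illustration of the idea, not
  a planned LEESA interface: the structs, author_names_eager(),
  AuthorNameCursor, and first_author_name() are names made up for this
  example. The eager function materializes every result, while the cursor
  walks the object model only as far as the caller asks.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace lazy_sketch {

// Hypothetical, simplified object model.
struct Author  { std::string name; };
struct Book    { std::vector<Author> authors; };
struct Catalog { std::vector<Book> books; };

// Eager: builds the complete result container before returning.
inline std::vector<std::string> author_names_eager(const Catalog& c) {
  std::vector<std::string> out;
  for (const Book& b : c.books)
    for (const Author& a : b.authors) out.push_back(a.name);
  return out;
}

// Lazy: a cursor that advances one (book, author) position per request.
// The catalog must outlive the cursor.
class AuthorNameCursor {
 public:
  explicit AuthorNameCursor(const Catalog& c) : c_(&c), book_(0), author_(0) {
    skip_exhausted_books();
  }
  bool done() const { return book_ == c_->books.size(); }
  const std::string& current() const {
    return c_->books[book_].authors[author_].name;
  }
  void next() {
    ++author_;
    skip_exhausted_books();
  }

 private:
  // Moves to the next book while the current one has no author left.
  void skip_exhausted_books() {
    while (book_ < c_->books.size() &&
           author_ >= c_->books[book_].authors.size()) {
      ++book_;
      author_ = 0;
    }
  }
  const Catalog* c_;
  std::size_t book_;
  std::size_t author_;
};

// Needs only the first result, so no container is built at all.
inline std::string first_author_name(const Catalog& c) {
  AuthorNameCursor cur(c);
  return cur.done() ? std::string() : cur.current();
}

}  // namespace lazy_sketch
```

  A caller that wants every name can still loop with done(), current(),
  and next(), and pays for a container only if it chooses to fill one.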
                                                       52 / 54
LEESA: Native XML Processing Using Multi-paradigm Design in C++
               XML Programming Concerns
    Representation and access to richly-typed hierarchical data
    Traversal (up, down, sideways)
        Statically fixed depth, Structure-shy
        Breadth-first, Depth-first
    Static schema conformance checking

               C++ Multi-paradigm Solution
    Object-oriented Programming
    Generative Programming
    Metaprogramming
    Generic programming
    Strategic Programming
                                                                   53 / 54
54 / 54

Native XML processing in C++ (BoostCon'11)

  • 1.
    LEESA: Toward NativeXML Processing Using Multi-paradigm Design in C++ May 16, 2011 Dr. Sumant Tambe Dr. Aniruddha Gokhale Software Engineer Associate Professor of EECS Dept. Real-Time Innovations Vanderbilt University www.dre.vanderbilt.edu/LEESA 1 / 54
  • 2.
     XML Programmingin C++. Specifically, data binding  What XML data binding stole from us!  Restoring order: LEESA  LEESA by examples  LEESA in detail  Architecture of LEESA  Type-driven data access  XML schema representation using Boost.MPL  LEESA descendant axis and strategic programming  Compile-time schema conformance checking  LEESA expression templates  Evaluation: productivity, performance, compilers  C++0x and LEESA  LEESA in future 2 / 54
  • 3.
    XML Infoset Cɷ 3 / 54
  • 4.
     Type system  Regular types  Anonymous complex elements  Repeating subsequence  XML data model  XML information set (infoset)  E.g., Elements, attributes, text, comments, processing instructions, namespaces, etc. etc.  Schema languages  XSD, DTD, RELAX NG  Programming Languages  XPath, XQuery, XSLT  Idioms and best practices  XPath: Child, parent, sibling, descendant axes; wildcards 4 / 54
  • 5.
     Predominant categories& examples (non-exhaustive)  DOM API  Apache Xerces-C++, RapidXML, Tinyxml, Libxml2, PugiXML, lxml, Arabica, MSXML, and many more …  Event-driven APIs (SAX and SAX-like)  Apache SAX API for C++, Expat, Arabica, MSXML, CodeSynthesis XSD/e, and many more …  XML data binding  Liquid XML Studio, Code Synthesis XSD, Codalogic LMX, xmlplus, OSS XSD, XBinder, and many more …  Boost XML??  No XML library in Boost (as of May 16, 2011)  Issues: very broad requirements, large XML specifications, good XML libraries exist already, encoding issues, round tripping issues, and more … 5 / 54
  • 6.
    XML query/traversal program XML Uses Schema XML Schema C++ Object-oriented i/p Compiler Generate Data Access Layer i/p Generate Executable Compiler (Code Generator) C++ Code  Process  Automatically generate vocabulary-specific classes from the schema  Develop application code using generated classes  Parse an XML into an object model at run-time  Manipulate the objects directly (CRUD)  Serialize the objects back to XML 6 / 54
  • 7.
     Example: Bookcatalog xml and xsd <catalog> <xs:complexType name=“book”> <book> <xs:sequence> <name>The C++ Programming Language</name> <xs:element name="name" type="xs:string" /> <price>71.94</price> <xs:element name="price" type="xs:double" /> <xs:element name="author" maxOccurs="unbounded"> <author> <xs:complexType> <name>Bjarne Stroustrup</name> <xs:sequence> <country>USA</country> <xs:element name="name" type="xs:string" /> </author> <xs:element name="country" type="xs:string" /> </book> </xs:sequence> <book> </xs:complexType> <name>C++ Coding Standards</name> </xs:element> <price>36.41</price> </xs:sequence> <author> </xs:complexType> <name>Herb Sutter</name> <country>USA</country> <xs:element name="catalog"> <xs:complexType> </author> <xs:sequence> <author> <xs:element name=“book” <name>Andrei Alexandrescu</name> type=“lib:book" <country>USA</country> maxOccurs="unbounded"> </author> </xs:element> </book> </xs:sequence> </catalog> </xs:complexType> </xs:element> 7 / 54
  • 8.
     Example: Bookcatalog xsd and generated C++ code <xs:complexType name=“book”> class author { <xs:sequence> private: <xs:element name="name" std::string name_; type="xs:string" /> std::string country_; <xs:element name="price" public: type="xs:double" /> std::string get_name() const; <xs:element name="author" maxOccurs="unbounded"> void set_name(std::string const &); <xs:complexType> std::string get_country() const; <xs:sequence> void set_country(std::string const &); <xs:element name="name" }; type="xs:string" /> <xs:element name="country" class book { type="xs:string" /> private: std::string name_; </xs:sequence> double price_; </xs:complexType> std::vector<author> author_sequence_; </xs:element> public: std::string get_name() const; </xs:sequence> void set_name(std::string const &); </xs:complexType> double get_price() const; <xs:element name="catalog"> void set_price(double); <xs:complexType> std::vector<author> get_author() const; <xs:sequence> void set_author(vector<author> const &); <xs:element name=“book” }; type=“lib:book" class catalog { maxOccurs="unbounded"> private: </xs:element> std::vector<book> book_sequence_; </xs:sequence> public: </xs:complexType> std::vector<book> get_book() const; </xs:element> void set_book(std::vector<book> const &); }; 8 / 54
  • 9.
     Book catalogapplication program  Example: Find all author names std::vector<std::string> get_author_names (const catalog & root) { std::vector<std::string> name_seq; for (catalog::book_const_iterator bi (root.get_book().begin ()); bi != root.get_book().end (); ++bi) { for (book::author_const_iterator ai (bi->get_author().begin ()); ai != bi->get_author().end (); ++ai) { name_seq.push_back(ai->name()); } } return name_seq; }  Advantages of XML data binding  Easy to use  C++ programming style and idioms  Vocabulary-specific API  Efficient  Type safety 9 / 54
  • 10.
     We lostsomething along the way. A lot actually!  Loss of succinctness  XML child axis replaced by nested for loops  Example: Find all author names Using XML data binding (20 lines) Using XPath (1 line) /book/author/name/text() std::vector<std::string> get_author_names (const catalog & root) { std::vector<std::string> name_seq; for (catalog::book_const_iterator bi = root.get_book().begin (); bi != root.get_book().end (); ++bi) { for (book::author_const_iterator ai = bi->get_author().begin ()); ai != bi->get_author().end (); ++ai) { name_seq.push_back(ai->name()); } } return name_seq; } 10 / 54
  • 11.
     Loss ofexpressive power  Example: “Find all names recursively”  What if catalogs are recursive too!  Descendant axis replaced by manual recursion. Hard to maintain. Using XPath (1 line) Using XML data binding using //name/text() BOOST_FOREACH (20+ lines) std::vector<std::string> get_author_names (const catalog & c) <catalog> { <catalog> std::vector<std::string> name_seq; <catalog> BOOST_FOREACH(const book &b, c.get_book()) <catalog> { <book><name>...</name></book> BOOST_FOREACH(const author &a, b.get_author()) <book><name>...</name></book> { </catalog> name_seq.push_back(a.name()); <book>...</book> } } <book>...</book> return name_seq; </catalog> } <book> <name>...</name> std::vector<std::string> get_all_names (const catalog & root) <price>...</price> { <author> std::vector<std::string> name_seq(get_author_names(root)); <name>...</name> BOOST_FOREACH (const catalog &c, root.get_catalog()) <country>...</country> { </author> std::vector<std::string> names = get_all_names(c); name_seq.insert(names.begin(), names.end()); </book> } <book>...</book> return name_seq; <book>...</book> } </catalog> </catalog> 11 / 54
  • 12.
     Loss ofXML programming idioms  Cannot use “wildcard” types  Example: Without spelling “Catalog” and “Book”, find names that are exactly at the third level. Using XPath (1 line) Using XML data binding /*/*/name/text() std::vector<std::string> get_author_names (const catalog & root) { std::vector<std::string> name_seq; . . . . . . return name_seq; }  Also known as structure-shyness  Descendant axis and wildcards don’t spell out every detail of the structure  Casting Catalog to Object class isn’t good enough  object.get_book()  compiler error!  object.get_children()  Inevitable casting! 12 / 54
  • 13.
     Hybrid approach:Pass XPath expression as a string Using XML data binding + XPath  No universal support  Boilerplate setup code DOMElement* root (static_cast<DOMElement*> (c._node ())); DOMDocument* doc (root->getOwnerDocument ());  DOM, XML namespaces, dom::auto_ptr<DOMXPathExpression> expr ( doc->createExpression ( xml::string ("//author").c_str (), resolver.get ())); Memory management dom::auto_ptr<DOMXPathResult> r ( expr->evaluate (  Casting is inevitable  Look and feel of two doc, DOMXPathResult::ITERATOR_RESULT_TYPE, 0)); APIs is (vastly) different while (r->iterateNext ()) { DOMNode* n (r->getNodeValue ()); author* a (  iterateNext() Vs. static_cast<author*> ( n->getUserData (dom::tree_node_key))); begin()/end() } cout << "Name : " << a->get_name () << endl;  Can’t use predicates on data outside xml  E.g. Find authors of highest selling books “/book[?condition?]/author/name” 13 / 54
  • 14.
     Schema-specificity (tomuch object-oriented bias?)  Each class has a different interface (not generic)  Naming convention of XML data binding tools vary Catalog Book Author +get_Book() +get_Author() +get_Name() +get_Price() +get_Country() +get_name()  Lost succinctness (axis-oriented expressions)  Lost structure-shyness (descendant axis, wildcards)  Can’t use Visitor design pattern (stateful traversal) with XPath 14 / 54
  • 15.
    Language for EmbeddedQuEry and TraverSAl Multi-paradigm Design in C++ 15 / 54
  • 16.
    * Catalog  A book catalog xsd +get_Book() +get_Catalog() 1  Generated six C++ classes 1 *  Catalog 1 1 Book  Book Complex classes Price +get_Author()  Author +get_Price() 1 +get_Name() Name  Price Simple classes 1 1 * 1  Country Country 1 1 Author  Name +get_Name() +get_Country() 1  Price, Country, and Name <catalog> <catalog> are simple wrappers  Catalogs are recursive <catalog> <catalog>...</catalog> </catalog> <book> <name>...</name> <price>...</price> <author> <name>...</name> <country>...</country> </author> </book> </catalog> </catalog> 16 / 54
  • 17.
    *  Restoring succinctness Catalog +get_Book() 1  Example: Find all author names +get_Catalog() 1  Child axis traversal * Book Price 1 1 +get_Author() +get_Price() 1 +get_Name() Name 1 1 Using XPath (1 line) * 1 1 1 Author Country /book/author/name/text() +get_Name() 1 +get_Country() Using LEESA (3 lines) Catalog croot = load_catalog(“catalog.xml”); std::vector<Name> author_names = evaluate(croot, Catalog() >> Book() >> Author() >> Name()); 17 / 54
  • 18.
    *  Restoring expressivepower Catalog +get_Book() 1  Example: Find all names recursively +get_Catalog() 1  Descendant axis traversal * Book Price 1 1 +get_Author() +get_Price() 1 +get_Name() Name 1 1 Using XPath (1 line) * 1 1 1 Author Country //name/text() +get_Name() 1 +get_Country() Using LEESA (2 lines) Catalog croot = load_catalog(“catalog.xml”); std::vector<Name> names = DescendantsOf(Catalog(), Name())(croot);  Fully statically typed execution  Efficient: LEESA “knows” where Names are! 18 / 54
  • 19.
     Restoring xmlprogramming * Catalog idioms (structure-shyness) +get_Book() +get_Catalog() 1  Example: Without spelling intermediate 1 types, find names that are exactly at * Book the third level. Price 1 1 +get_Author()  Wildcards in a typed query! +get_Price() 1 +get_Name() Name 1 1 Using XPath (1 line) * 1 1 1 Author Country /*/*/name/text() +get_Name() 1 +get_Country() Using LEESA (3 lines) namespace LEESA { struct Underbar {} _; } Catalog croot = load_catalog(“catalog.xml”); std::vector<Name> names = LevelDescendantsOf(Catalog(), _, _, Name())(croot);  Fully statically typed execution  Efficient: LEESA “knows” where Books, Authors, and Names are! 19 / 54
  • 20.
    *  User-defined filters Catalog  Example: Find names of authors from +get_Book() +get_Catalog() 1 Country == USA 1 *  Basically unary functors Book 1 1  Supports free functions, function Price +get_Author() objects, boost::bind, C++0x lambda +get_Price() 1 +get_Name() Name 1 1 * 1 1 1 Author Country +get_Name() 1 Using XPath (1 line) +get_Country() //author[country/text() = ‘USA’]/name/text() Using LEESA (6 lines) Catalog croot = load_catalog(“catalog.xml”); std::vector<Name> author_names = evaluate(croot, Catalog() >> DescendantsOf(Catalog(), Author()) >> Select(Author(), [](const Author &a) { return a.get_Country() == “USA"; }) >> Name()); 20 / 54
  • 21.
    *  Tuplefication!! Catalog  Example: Pair the name and country of +get_Book() +get_Catalog() 1 all the authors 1 *  std::vector of Book Price 1 1 boost::tuple<Name *, Country *> +get_Author() +get_Price() 1 +get_Name() Name 1 1 * 1 1 1 Author Country +get_Name() 1 Using XPath +get_Country() ??????????????????????????????? Using LEESA (5 lines) Catalog croot = load_catalog(“catalog.xml”); std::vector<boost::tuple<Name *, Country *> > tuples = evaluate(croot, Catalog() >> DescendantsOf(Catalog(), Author()) >> MembersAsTupleOf(Author(), make_tuple(Name(), Country()))); 21 / 54
  • 22.
    *  Using visitors MyVisitor Catalog +visit_Catalog()  Gang-of-four Visitor design pattern +visit_Book() +get_Book() 1 +visit_Author() +get_Catalog() +visit_Name()  Visit methods for all Elements +visit_Country() 1 +visit_Price() *  Example: Visit catalog, books, authors, Price 1 1 Book and names in that order +get_Author() +get_Price() 1  Stateful, statically typed traversal +get_Name() Name 1 1  fixed depth child axis * 1 1 1 Author Country +get_Name() 1 Using XPath +get_Country() ??????????????????????????????? Catalog Using LEESA (7 lines) Book1 Book2 Catalog croot = load_catalog(“catalog.xml”); MyVisitor visitor; std::vector<Country> countries = A1 A2 A3 A4 evaluate(croot, Catalog() >> visitor >> Book() >> visitor >> Author() >> visitor C1 C4 C2 C3 >> Country() >> visitor); 22 / 54
  • 23.
    *  Using visitors(depth-first) MyVisitor Catalog +visit_Catalog()  Gang-of-four Visitor design pattern +visit_Book() +get_Book() 1 +visit_Author() +get_Catalog() +visit_Name()  Visit methods for all Elements +visit_Country() 1 +visit_Price() *  Example: Visit catalog, books, authors, Price 1 1 Book and names in depth-first manner +get_Author() +get_Price() 1  Stateful, statically typed traversal +get_Name() Name 1  fixed depth child axis 1 * 1 1 1 Author Country +get_Name() Using XPath 1 +get_Country() ??????????????????????????????? Catalog Default precedence. Using LEESA (7 lines) No parenthesis needed. Book1 Book2 Catalog croot = load_catalog(“catalog.xml”); MyVisitor visitor; std::vector<Book> books = evaluate(croot, Catalog() >> visitor A1 A2 A3 A4 >>= Book() >> visitor >>= Author() >> visitor >>= Country() >> visitor); C1 C2 C3 C4 23 / 54
  • 24.
    Visited Child Axis Child Axis Parent Axis Parent Axis (breadth-first) (depth-first) (depth-first) (breadth-first) Catalog() >> Book() >> v >> Author() >> v Catalog() >>= Book() >> v >>= Author() >> v Default precedence. Name() << v << Author() << v << Book() << v No parenthesis needed. Name() << v <<= Author() << v <<= Book() << v 24 / 54
  • 25.
    *  Composing namedqueries MyVisitor Catalog +visit_Catalog()  Queries can be named, composed, and +get_Book() +get_Catalog() 1 +visit_Book() +visit_Author() passed around as executable +visit_Name() +visit_Country() 1 expressions * +visit_Price() Book  Example: Price 1 1 +get_Author() For each book +get_Price() +get_Name() 1 Name print(country of the author) 1 1 * 1 print(price of the book) Country 1 1 Author +get_Name() Using XPath 1 +get_Country() ??????????????????????????????? Using LEESA (6 lines) Catalog croot = load_catalog(“catalog.xml”); MyVisitor visitor; BOOST_AUTO(v_country, Author() >> Country() >> visitor); BOOST_AUTO(v_price, Price() >> visitor); BOOST_AUTO(members, MembersOf(Book(), v_country, v_price)); evaluate(croot, Catalog() >>= Book() >> members); 25 / 54
  • 26.
     Using visitors(recursively)  Hierarchical Visitor design pattern  Visit and Leave methods for all elements  Depth awareness  Example: Visit everything!!  Stateful, statically typed traversal  Descendant axis = recursive  AroundFullTD = AroundFullTopDown Using XPath ??????????????????????????????? Using LEESA (3 lines!!) Catalog croot = load_catalog(“catalog.xml”); MyHierarchicalVisitor v; AroundFullTD(Catalog(), VisitStrategy(v), LeaveStrategy(v)))(croot); 26 / 54
  • 27.
     LEESA 1. Is not an xml parsing library XML data binding tool 2. Does not validate xml files can do both 3. Does not replace/compete with XPath 4. Does not resolve X/O impedance mismatch  More reading: “Revealing X/O impedance mismatch”, Dr. R Lämmel  LEESA 1. Is a query and traversal library for C++ 2. Validates XPath-like queries at compile-time (schema conformance) 3. Is motivated by XPath 4. Goes beyond XPath 5. Simplifies typed XML programming 6. Is an embedded DSEL (Domain-specific embedded language) 7. Is applicable beyond xml (E.g., Google Protocol Buffers, model traversal, hand coded class hierarchies, etc.) 27 / 54
  • 28.
     XML Programmingin C++, specifically data-binding  What XML data binding stole from us!  Restoring order: LEESA  LEESA by examples  LEESA in detail  Architecture of LEESA  Type-driven data access  XML schema representation using Boost.MPL  LEESA descendant axis and strategic programming  Compile-time schema conformance checking  LEESA expression templates  Evaluation: productivity, performance, compilers  C++0x and LEESA  LEESA in future 28 / 54
  • 29.
     The Process LEESA Expressions Written by Programmers Axes-oriented Recursive Traversal Traversal Expressions (Strategic Programming) Ch ec ke d es ag ai Us ns t XML i/p Schema Static Generate meta-information i/p Extended Schema Type-driven i/p Compiler Generate Data Access Layer C++ i/p Generate Executable (Code Compiler Generator) Object-oriented Generate Data Access Layer i/p C++ Code 29 / 54
  • 30.
    XML Schema Static Type-driven Visitor meta- Data Access Declarations information Layer C++ (.h, .cpp) Object-oriented Meta-data Schema XML ALL Data Access Doxygen XML XML XML XSLT Generator Compiler XML Layer C++ (.h) LEESA’s gen-meta.py script  Extended schema compiler = 4 step process  XML schema language (XSD) specification is huge and complex  Don’t reinvent the wheel: xml data binding tools already process it  Naming convention of xml data binding tools vary  Applicability beyond xml data binding  E.g. Google Protocol Buffers (GPB), hand written class hierarchies  Meta-data generator script inserts visitor declaration in the C++ classes 30 / 54
  • 31.
    To fix  Different interface of each class  Generic API “children” wrappers to navigate aggregation  Generated by the Python script  More amenable to composition std::vector<Book> children (Catalog &c, Book const *) { return c.get_Book(); } std::vector<Catalog> children (Catalog &c, Catalog const *) { return c.get_Catalog(); } std::vector<Author> children (Book &b, Author const *) { return b.get_Author(); } Price children (Book &b, Price const *) { return b.get_Price(); } Name children (Book &b, Name const *) { return b.get_Name(); } Country children (Author &a, Country const *) { return a.get_Country(); } Name children (Author &a, Name const *) { return a.get_Name(); } 31 / 54
  • 32.
     Ambiguity!  Simple elements and attributes are mapped to built-in types  “children” function overloads become ambiguous <xs:complexType name=“Author”> <xs:sequence> <xs:element name=“first_name" type="xs:string" /> Mapping <xs:element name=“last_name“ type="xs:string" /> </xs:sequence> </xs:complexType> gen-meta.py std::string children (Author &a, std::string const *) { return a.get_first_name(); } std::string children (Author &a, std::string const *) { return a.get_last_name(); } 32 / 54
  • 33.
     Solution 1:Automatic schema transformation  Force data binding tools to generate unique C++ types  gen-meta.py can transforms input xsd while preserving semantics <xs:complexType name=“Author”> <xs:sequence> <xs:element name=“first_name" type="xs:string" /> Mapping <xs:element name=“last_name“ type="xs:string" /> </xs:sequence> </xs:complexType> Transformation (gen-meta.py) <xs:complexType name=“Author”> <xs:sequence> <xsd:element name=“first_name"> Mapping <xsd:simpleType> <xsd:restriction base="xsd:string" /> </xsd:simpleType> </xsd:element> <xsd:element name=“last_name"> <xsd:simpleType> <xsd:restriction base="xsd:string" /> </xsd:simpleType> </xsd:element> </xs:sequence> </xs:complexType> 33 / 54
  • 34.
     Solution 1limitations: Too many types! Longer compilation times.  Solution 2: Generate placeholder types  Create unique type aliases using a template and integer literals  Not implemented! <xs:complexType name=“Author”> <xs:sequence> <xs:element name=“first_name" type="xs:string" /> <xs:element name=“last_name“ type="xs:string" /> </xs:sequence> </xs:complexType> Code generation (gen-meta.py) namespace LEESA { template <class T, unsigned int I> struct unique_type { typedef T nested; }; } namespace Library { typedef LEESA::unique_type<std::string, 1> first_name; typedef LEESA::unique_type<std::string, 2> last_name; } 34 / 54
  • 35.
     A keyidea in LEESA  Externalize structural meta-information using Boost.MPL  LEESA’s meta-programs traverse the meta-information at compile-time template <class Kind> * struct SchemaTraits { Catalog typedef mpl::vector<> Children; // Empty sequence }; +get_Book() 1 +get_Catalog() template <> 1 struct SchemaTraits <Catalog> * { Book typedef mpl::vector<Book, Catalog> Children; Price 1 1 }; +get_Author() template <> +get_Price() 1 +get_Name() Name struct SchemaTraits <Book> 1 { 1 typedef mpl::vector<Name, Price, Author> Children; * 1 }; Country 1 1 Author template <> struct SchemaTraits <Author> +get_Name() { 1 +get_Country() typedef mpl::vector<Name, Country> Children; }; 35 / 54
  • 36.
     A keyidea in LEESA  Externalize structural meta-information using Boost.MPL  Descendant meta-information is a transitive closure of Children template <class Kind> struct SchemaTraits { typedef mpl::vector<> Children; // Empty sequence * }; template <> struct SchemaTraits <Catalog> { Catalog typedef mpl::vector<Book, Catalog> Children; }; +get_Book() +get_Catalog() 1 template <> struct SchemaTraits <Book> { typedef mpl::vector<Name, Price, Author> Children; 1 }; * template <> struct SchemaTraits <Author> { Book typedef mpl::vector<Name, Country> Children; Price 1 1 }; +get_Author() typedef boost::mpl::true_ True; +get_Price() 1 +get_Name() Name typedef boost::mpl::false_ False; 1 template<class A, class D> struct IsDescendant : False {}; 1 template<> struct IsDescendant<Catalog, Catalog> : True {}; * 1 template<> struct IsDescendant<Catalog, Book> : True {}; 1 1 Author template<> struct IsDescendant<Catalog, Name> : True {}; Country template<> struct IsDescendant<Catalog, Price> : True {}; +get_Name() template<> struct IsDescendant<Catalog, Author> : True {}; 1 +get_Country() template<> struct IsDescendant<Catalog, Country> : True {}; template<> struct IsDescendant<Book, Name> : True {}; template<> struct IsDescendant<Book, Price> : True {}; template<> struct IsDescendant<Book, Author> : True {}; template<> struct IsDescendant<Book, Country> : True {}; template<> struct IsDescendant<Author, Name> : True {}; template<> struct IsDescendant<Author, Country> : True {}; 36 / 54
  • 37.
    std::vector<Country> countries =DescendantsOf(Catalog(), Country())(croot);  Algorithm (conceptual) 1. IsDescendant<Catalog, Country>::value Catalog 2. Find all children types of Catalog SchemaTraits<Catalog>::Children = boost::mpl::vector<Book, Catalog> 3. Iterate over Boost.MPL vector Book Catalog 4. IsDescendant<Book, Country>::value 5. Use type-driven data access on each Catalog std::vector<Book>=children(Catalog&, Book*) Name Author Price For Catalogs repeat step (1) 6. Find all children types of Book SchemaTraits<Book>::Children = boost::mpl::vector<Name, Author, Price> Country Name 7. Iterate over Boost.MPL vector 8. IsDescendant<Name, Country>::value 9. IsDescendant<Price, Country>::value 10. IsDescendant<Author, Country>::value 11. Use type drive data access on each Book std::vector<Author>=children(Book&, Author*) 12. Find all children types of Author SchemaTraits<Author>::Children = boost::mpl::vector<Country, Name> 13. Repeat until Country objects are found 37 / 54
  • 38.
     Strategic ProgrammingParadigm  A systematic way of creating recursive tree traversal  Developed in 1998 as a term rewriting language: Stratego  Why LEESA uses strategic programming  Generic  LEESA can be designed without knowing the types in a xml tree  Recursive  LEESA can handles mutually and/or self recursive types  Reusable  LEESA can be reused as a library for any xsd  Composable  LEESA can be extended by its users using policy-based templates  Basic combinators  Identity, Fail, Sequence, Choice, All, and One 38 / 54
  • 39.
    fullTD(node) fullTD(node) All(node, strategy) { { { visit(node); visit(node); forall children c of node forall children c of node All(node, fullTD); strategy(c); fullTD(c); } } } Pre-order traversal pseudo-code (fullTopDown) fullTD(node) { Recursive seq(node, visit, All(fullTD)); traversal (1 out of many) } seq(node,strategy1,strategy2) { strategy1(node); strategy2(node); } Basic All(node, strategy) Combinators { (2 out of 6) forall children c of node strategy(c); } 39 / 54
  • 40.
    template <class Strategy1, template <class Strategy> class Strategy2> class All Boost.MPL class Seq { Meta-information { template <class Data> template <class Data> void operator()(Data d) void operator()(Data d) { { foreach T in SchemaTraits<Data>::Children Strategy1(d); std::vector<T> t = children(d, (T *)0); Strategy2(d); Strategy(t); } } Type-driven }; }; Data Access Sequence + All = FullTD template <class Strategy> class FullTD { template <class data> void operator()(Data d) { Seq<Strategy,All<FullTD>>(d); } }; Note: Objects and constructors omitted for brevity 40 / 54
  • 41.
    * BOOST_AUTO(prices, DescendantsOf(Catalog(),Price())); Catalog  LEESA uses FullTopDown<Accumulator<Price>> +get_Book() 1  But schema unaware recursion in every sub-structure +get_Catalog() is inefficient 1 *  We know that Authors do not contain Prices Book Price 1 1 +get_Author() LEESA’s +get_Price() +get_Name() FullTD may be schema-aware 1 inefficient traversal is optimal * 1 1 Author Country +get_Name() +get_Country() IsDescendant <Catalog,Price> = True IsDescendant <Author,Price> = False Bypass unnecessary sub-structures (Author) using meta-programming 41 / 54
  • 42.
     LEESA hascompile-time schema conformance checking  LEESA queries compile only if they agree with the schema  Uses externalized schema and meta-programming  Error message using BOOST_MPL_ASSERT  Tries to reduce long and incomprehensible error messages  Shows assertion failures in terms of concepts  ParentChildConcept, DescendantKindConcept, etc.  Originally developed for C++0x concepts  Examples DescendantKindConcept Failure ParentChildConcept Failure 1. BOOST_AUTO(prices, DescendantsOf(Author(), Price()); 2. BOOST_AUTO(books, Catalog() >> Book() >> Book()); 3. BOOST_AUTO(countries, LevelDescendantsOf(Catalog(),_,Country()); LevelDescendantKindConcept Failure 42 / 54
  • 43.
     Country isat least 2 “steps” away from a Catalog LevelDescendantsOf(Catalog(),_,Country()); 1>------ Build started: Project: library, Configuration: Release Win32 ------ 1> driver.cxx 1> using native typeof 1>C:mySVNLEESAincludeLEESA/SP_Accumulation.cpp(112): error C2664: 'boost::mpl::assertion_failed' : cannot convert parameter 1 from 'boost::mpl::failed ************LEESA::LevelDescendantKindConcept<ParentKind,DescendantKind,SkipCount,Custom>::* ***********' to 'boost::mpl::assert<false>::type' 1> with 1> [ 1> ParentKind=library::Catalog, 1> DescendantKind=library::Country, 1> SkipCount=1, 1> Custom=LEESA::Default 1> ] 1> No constructor could take the source type, or constructor overload resolution was ambiguous 1> driver.cxx(155) : see reference to class template instantiation 'LEESA::LevelDescendantsOp<Ancestor,Descendant,SkipCount,Custom>' being compiled 1> with 1> [ 1> Ancestor=LEESA::Carrier<library::Catalog>, 1> Descendant=LEESA::Carrier<library::Country>, 1> SkipCount=1, 1> Custom=LEESA::Default 1> ] 1>C:mySVNLEESAincludeLEESA/SP_Accumulation.cpp(112): error C2866: 'LEESA::LevelDescendantsOp<Ancestor,Descendant,SkipCount,Custom>::mpl_assertion_in_line_130' : a const static data member of a managed type must be initialized at the point of declaration 1> with 1> [ 1> Ancestor=LEESA::Carrier<library::Catalog>, 1> Descendant=LEESA::Carrier<library::Country>, 1> SkipCount=1, 1> Custom=LEESA::Default 1> ] 1> Generating Code... ========== Build: 0 succeeded, 1 failed, 0 up-to-date, 0 skipped ========== 43 / 54
  • 44.
     (Nearly) all LEESA queries are expression templates
       Hand rolled. Not using Boost.Proto
       Catalog() >> Book() >> Author() >> Name()

     [Figure: expression tree for the query, numbered in order of execution]
       1. ChainExpr: Catalog, GetChildren<Catalog, Book>
       2. ChainExpr: ChainExpr 1 (LResultType), GetChildren<Book, Author>
       3. ChainExpr: ChainExpr 2 (LResultType), GetChildren<Author, Name>

     template <class L, class H>
     ChainExpr<L, GetChildren<typename ExpressionTraits<L>::result_type, H> >
     operator >> (L l, H h)
     {
       typedef typename ExpressionTraits<L>::result_type LResultType;
       typedef GetChildren<LResultType, H> GC;
       return ChainExpr<L, GC>(l, h);
     }
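To see which type the operator above produces for a whole query, here is a minimal sketch. The bodies of `ChainExpr`, `GetChildren` and `ExpressionTraits` are assumptions made for illustration (the slide only shows how they are used), everything lives in a hypothetical `sketch` namespace, and the right-hand node is default-constructed instead of being built from `h`.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch {

// Placeholder schema types standing in for generated data-binding classes.
struct Catalog {}; struct Book {}; struct Author {}; struct Name {};

// Assumed node: "given a P, fetch its children of kind C".
template <class P, class C>
struct GetChildren { typedef C result_type; };

// Assumed node: "run L, then feed its result to R".
template <class L, class R>
struct ChainExpr {
  typedef typename R::result_type result_type;
  L l;
  R r;
  ChainExpr(L l_, R r_) : l(l_), r(r_) {}
};

// A plain schema type yields itself; a ChainExpr yields what its right side yields.
template <class T>
struct ExpressionTraits { typedef T result_type; };
template <class L, class R>
struct ExpressionTraits<ChainExpr<L, R> > {
  typedef typename ChainExpr<L, R>::result_type result_type;
};

// Same shape as the operator on the slide.
template <class L, class H>
ChainExpr<L, GetChildren<typename ExpressionTraits<L>::result_type, H> >
operator>>(L l, H) {
  typedef typename ExpressionTraits<L>::result_type LResultType;
  typedef GetChildren<LResultType, H> GC;
  return ChainExpr<L, GC>(l, GC());
}

// The query's type is the nested tree; no data has been traversed yet.
typedef decltype(Catalog() >> Book() >> Author() >> Name()) Query;
typedef ChainExpr<
          ChainExpr<
            ChainExpr<Catalog, GetChildren<Catalog, Book> >,
            GetChildren<Book, Author> >,
          GetChildren<Author, Name> > Expected;
static_assert(std::is_same<Query, Expected>::value,
              "operator>> nests ChainExpr nodes left to right");

}  // namespace sketch
```

Because the whole query is encoded in a type, each `GetChildren<Parent, Child>` pair is available to compile-time checks before any XML is touched.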
  • 45.
     (Nearly) all LEESA queries are expression templates
       Hand rolled. Not using Boost.Proto
       Every LEESA expression becomes a unary function object
       LEESA query: systematically composed unary function objects
       Catalog() >>= Book() >> Author() >> Name()

     [Figure: composition of function objects for the query]
       ChainExpr
         1   Catalog
         2   DepthFirstGetChildren<Catalog, Book>
               2b  ChainExpr
                     2a  ChainExpr: Book, GetChildren<Book, Author>
                     GetChildren<Author, Name>
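Composing unary function objects can be sketched as follows. None of these names come from LEESA: `GetBooks`, `GetAuthors`, `Chain`, `chain` and `sampleCatalog` are hypothetical, the data model is hand-written rather than generated, and the sketch shows plain left-to-right chaining only, not the depth-first `>>=` operator.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hand-written stand-ins for generated data-binding classes.
struct Author  { std::string name; };
struct Book    { std::vector<Author> authors; };
struct Catalog { std::vector<Book> books; };

// Unary function object: Catalog -> its Books.
struct GetBooks {
  typedef Catalog argument_type;
  typedef Book    result_type;
  std::vector<Book> operator()(const Catalog& c) const { return c.books; }
};

// Unary function object: Book -> its Authors.
struct GetAuthors {
  typedef Book   argument_type;
  typedef Author result_type;
  std::vector<Author> operator()(const Book& b) const { return b.authors; }
};

// Chain<F, G> is itself a unary function object: it applies F, then applies G
// to every element F produced and concatenates the results.
template <class F, class G>
struct Chain {
  typedef typename F::argument_type argument_type;
  typedef typename G::result_type   result_type;
  F f;
  G g;
  std::vector<result_type> operator()(const argument_type& in) const {
    std::vector<result_type> out;
    const std::vector<typename F::result_type> mid = f(in);
    for (std::size_t i = 0; i < mid.size(); ++i) {
      const std::vector<result_type> part = g(mid[i]);
      out.insert(out.end(), part.begin(), part.end());
    }
    return out;
  }
};

// Helper that deduces F and G.
template <class F, class G>
Chain<F, G> chain(F f, G g) {
  Chain<F, G> c = { f, g };
  return c;
}

// Small sample document: two books, three authors in total.
inline Catalog sampleCatalog() {
  Author ann = { "Ann" }, bob = { "Bob" }, cy = { "Cy" };
  Book b1; b1.authors.push_back(ann); b1.authors.push_back(bob);
  Book b2; b2.authors.push_back(cy);
  Catalog c; c.books.push_back(b1); c.books.push_back(b2);
  return c;
}
```

Since a `Chain` exposes the same `argument_type`/`result_type` interface as its parts, chains can be nested to any depth, which is the property the slide relies on.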
  • 46.
     XML Programming in C++, specifically data binding
     What XML data binding stole from us!
     Restoring order: LEESA
     LEESA by examples
     LEESA in detail
       Architecture of LEESA
       Type-driven data access
       XML schema representation using Boost.MPL
       LEESA descendant axis and strategic programming
       Compile-time schema conformance checking
       LEESA expression templates
     Evaluation: productivity, performance, compilers
     C++0x and LEESA
     LEESA in future
  • 47.
     Reduction in boilerplate traversal code
       Results from the 2009 paper in the Working Conference on Domain-Specific Languages, Oxford, UK
       87% reduction in traversal code
  • 48.
     CodeSynthesis xsd data binding tool on the catalog xsd
       Abstraction penalty from construction, copying, and destruction of internal containers (std::vector<T> and LEESA::Carrier<T>)
       GNU Profiler: Highest time spent in std::vector<T>::insert and iterator dereference functions
     [Chart annotation, data binding: 33 seconds for parsing, validating, and object model construction]
  • 49.
    Compilation time affects programmer productivity
      Experiment
        An XML schema containing 300 types (4 recursive)
        gcc 4.5 (with and without variadic templates)
    [Chart: compilation time; only the label "(data binding)" survives]
  • 50.
    Experiment: Total time to build an executable from an xsd on 4 compilers
      XML schema containing 300 types (4 recursive)
      5 LEESA expressions (all using descendant axis)
      Tested on Intel Core 2 Duo 2.67 GHz, 4 GB laptop
    [Bar chart: build time per compiler. Bar labels are missing; the surviving values are 79 44 18 15 54 126 112 101 60 95 95 95 95]
  • 51.
     Readability improvements
       Lambdas!
         LEESA actions (e.g., Select, Sort) can use C++0x lambdas
       static_assert for improved error reporting
       auto for naming LEESA expressions
     Performance improvements (run-time)
       Rvalue references and move semantics
       Optimize away internal copies of large containers
     Performance improvements (compile-time)
       Variadic templates: faster schema conformance checking
       No need to use BOOST_MPL_LIMIT_VECTOR_SIZE and Boost.Preprocessor tricks
     Simplifying LEESA’s implementation
       Trailing return-type syntax and decltype
       Right angle bracket syntax
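As a rough illustration of the readability and run-time points, the sketch below passes a lambda to a selection step and moves the container through it instead of copying. The `select` function, the `Book` struct and `recentBooks` are hypothetical stand-ins and are not LEESA's Select action or its API.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical element type, not taken from any generated binding.
struct Book {
  int year;
  double price;
};

// Hypothetical Select-like step: keeps the elements for which pred is true.
// The input is taken by value so a caller can move a large container in,
// and kept elements are moved (not copied) into the result.
template <class T, class Pred>
std::vector<T> select(std::vector<T> in, Pred pred) {
  std::vector<T> out;
  for (auto& x : in) {
    if (pred(x)) out.push_back(std::move(x));
  }
  return out;
}

// The predicate is a C++0x lambda named with auto, so no separate functor
// class has to be written.
inline std::vector<Book> recentBooks(std::vector<Book> books) {
  auto isRecent = [](const Book& b) { return b.year > 2000; };
  return select(std::move(books), isRecent);
}
```

A caller that no longer needs its container can write `recentBooks(std::move(books))`, which avoids copying the vector on the way in.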
  • 52.
     Become a part of the Boost libraries!?
     Extend LEESA to support
       Google Protocol Buffers (GPB)
       Apache Thrift
       Or any “schema-first” data binding in C++
     Better support from data binding tools?
     Parallelization on multiple cores
       Parallelize query execution on multiple cores behind LEESA’s high-level declarative programming API
     Co-routine style programming model
       LEESA expressions return containers
       Expression to container: expensive!
       Expression to iterator: cheap!
       Compute result only when needed (lazy)
     XML literal construction
       Checked against schema at compile-time
  • 53.
    LEESA  NativeXML Processing Using Multi-paradigm Design in C++ XML Programming Concerns Representation Traversal Static Schema and access to (up, down, conformance richly-typed sideways) checking hierarchical data Statically Structure-shy fixed depth Breadth-first Depth-first Object-oriented Generative Programming Programming Metaprogramming Generic Strategic programming Programming C++ Multi-paradigm Solution 53 / 54
  • 54.