DZone
Web Dev Zone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
  • Refcardz
  • Trend Reports
  • Webinars
  • Zones
  • |
    • Agile
    • AI
    • Big Data
    • Cloud
    • Database
    • DevOps
    • Integration
    • IoT
    • Java
    • Microservices
    • Open Source
    • Performance
    • Security
    • Web Dev
DZone > Web Dev Zone > Adventures in Parsing C: ASTs for Switch Statements

Adventures in Parsing C: ASTs for Switch Statements

Eli Bendersky user avatar by
Eli Bendersky
·
Apr. 15, 12 · Web Dev Zone · Interview
Like (0)
Save
Tweet
5.26K Views

Join the DZone community and get the full member experience.

Join For Free

Last week I received an email from a user of pycparser that mentioned the strange AST that results when pycparser parses a switch statement.

Let’s take the following snippet of C code for example. Don’t look for semantic sense in it – it’s just used to test the parser:

switch (myvar) {
    case 10:
        k = 10;
        p = k + 1;
        return 10;
    case 20:
    case 30:
        return 20;
    default:
        break;
}

And the AST pycparser was generating for this code:

Switch:
  ID: myvar
  Compound:
    Case:
      Constant: int, 10
      Assignment: =
        ID: k
        Constant: int, 10
    Assignment: =
      ID: p
      BinaryOp: +
        ID: k
        Constant: int, 1
    Return:
      Constant: int, 10
    Case:
      Constant: int, 20
      Case:
        Constant: int, 30
        Return:
          Constant: int, 20
    Default:
      Break:

There are two problems here:

  1. Only the first statement inside each case is made a child of that case – the other statements are siblings.
  2. Two consecutive case statements without any other statements in between (fall-through) cause the second case to become the child of the first one. If additional consecutive case statements follow, they nest even further.

Since the parser follows the C grammar pretty closely, I immediately went to look into the C99 standard, and indeed, this is exactly the parse tree that it mandates. Here’s the relevant portion of the language grammar (from section A.2.3):

(6.8) statement:
              labeled-statement
              compound-statement
              expression-statement
              selection-statement
              iteration-statement
              jump-statement
(6.8.1) labeled-statement:
              identifier : statement
              case constant-expression : statement
              default : statement

Note that a case (and default, which is equivalent to case in this whole discussion) must be followed by one, and only one other statement. This explains why pycparser parses the code above the way it does.

However, the goal of pycparser is not to generate a parse tree. It is to generate an abstract syntax tree (AST), which follows the language semantics rather than its grammar. Hey, I already wrote about this stuff!

So today I fixed this part of pycparser, by adding a dedicated AST transformation after parsing a switch statement. The transformation isn’t really complicated, and the AST pycparser generates now is much friendlier. Here it is, for the same code:

Switch:
  ID: myvar
  Compound:
    Case:
      Constant: int, 10
      Assignment: =
        ID: k
        Constant: int, 10
      Assignment: =
        ID: p
        BinaryOp: +
          ID: k
          Constant: int, 1
      Return:
        Constant: int, 10
    Case:
      Constant: int, 20
    Case:
      Constant: int, 30
      Return:
        Constant: int, 20
    Default:
      Break:

As you can see, the problems mentioned above were fixed. This fix is available in the pycparser Mercurial repository and will be part of the next release.

 

Abstract syntax Tree (data structure) Semantics (computer science) Parser (programming language) Snippet (programming) Testing source control Release (agency) IT

Published at DZone with permission of Eli Bendersky. See the original article here.

Opinions expressed by DZone contributors are their own.

Popular on DZone

  • Testing Under the Hood Or Behind the Wheel
  • Servlets Listeners Introduction and Examples
  • How to Configure Git in Eclipse IDE
  • Instancio: Random Test Data Generator for Java (Part 1)

Comments

Web Dev Partner Resources

X

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • MVB Program
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 600 Park Offices Drive
  • Suite 300
  • Durham, NC 27709
  • support@dzone.com
  • +1 (919) 678-0300

Let's be friends:

DZone.com is powered by 

AnswerHub logo