Building a JSON Parser in Rust Pt. 4

September 7th, 2024

Welcome to the last part of the Building a JSON Parser in Rust series! If you haven't read parts 1, 2, and 3, please do so first. For a quick refresher: in part 1, we built a parser that recognized empty JSON objects such as {}. In part 2, we extended this parser to support key-value pairs where all values are strings. In part 3, we added support for parsing booleans, numbers, and null values.

Step 4. Parsing arrays and objects

This step involves parsing arrays and objects as child values. This lets us handle the arbitrarily deep nesting that we see in real-world JSON files. A sample JSON file is given below:

{
    "key1": [1, 2, 3],
    "key2": {
        "key3": "value3"
    },
    "key4": [
        {
            "key5": "value5"
        }
    ]
}

Feel free to use this as a valid test case. An invalid test case at this point would be a file with a syntax error (for example, a missing , or ").

The first thing we need to do is augment our parse_value function to include support for objects and arrays.

fn parse_value(chars: &mut Peekable<Chars>) -> Result<bool, String> {
    match chars.peek() {
        ...
        Some('[') => parse_array(chars),
        Some('{') => parse_object(chars),
        _ => Err("Invalid JSON: failed to parse value".to_string())
    }
}
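
A detail worth spelling out: peek only looks at the next character and leaves it in the stream. That is why parse_array can check for the [ again itself after parse_value has already matched on it. The snippet below is a small sketch of that behaviour. The peek_kind helper is hypothetical and exists only for this illustration; it is not part of the parser.

```rust
use std::iter::Peekable;
use std::str::Chars;

// Hypothetical helper (not part of the series' code): names the kind of
// value that starts at the current position without consuming any input.
fn peek_kind(chars: &mut Peekable<Chars>) -> &'static str {
    match chars.peek() {
        Some('[') => "array",
        Some('{') => "object",
        Some(_) => "other",
        None => "end of input",
    }
}
```

Calling peek_kind on "[1,2]".chars().peekable() reports an array, and a following call to chars.next() still yields the [ because nothing was consumed.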

Because of the way we've already organized our code, the parse_object function is already written for us! All we need to do is write a function to parse arrays.

At its core, an array in JSON is just a list of values. Since we already have a function to parse values (parse_value), we just need to write a function that will parse a list of values.

The first step is to make sure that we are actually parsing an array by checking that the next character in the peekable is a [.

if let Some('[') = chars.peek() {
    chars.next();
} else {
    return Err("Invalid JSON: expected opening bracket".to_string());
}

Then, we should handle the case of an empty array (where the next character is a ]).

if let Some(']') = chars.peek() {
    chars.next();
    return Ok(true);
}

Now, we can parse the values in the array. This involves the following two steps:

  1. parse a value
  2. parse a comma, or a closing bracket

while let Some(_) = chars.peek() {
    if parse_value(chars).is_err() {
        return Err("Invalid JSON: failed to parse value in array".to_string());
    }

    match chars.next() {
        Some(',') => (),
        Some(']') => return Ok(true),
        _ => return Err("Invalid JSON: expected comma or closing bracket after value".to_string())
    }
}

return Err("Invalid JSON: failed to parse array".to_string()); // if we get here, the input ended before the array was closed
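
To see how the loop treats different inputs, here is a self-contained sketch you can experiment with. It makes a few assumptions that are not part of the series: parse_digit stands in for parse_value and accepts exactly one ASCII digit, parse_elements is the loop on its own (it expects the opening [ to have been consumed already), and check is a convenience wrapper. It also uses ? to pass the inner error along instead of replacing the message.

```rust
use std::iter::Peekable;
use std::str::Chars;

// Stand-in for parse_value (an assumption for this sketch): accepts exactly
// one ASCII digit and rejects everything else.
fn parse_digit(chars: &mut Peekable<Chars>) -> Result<bool, String> {
    match chars.next() {
        Some(c) if c.is_ascii_digit() => Ok(true),
        _ => Err("Invalid JSON: expected a digit".to_string()),
    }
}

// The value-then-separator loop on its own. It runs on the input that
// follows an already consumed '['.
fn parse_elements(chars: &mut Peekable<Chars>) -> Result<bool, String> {
    while chars.peek().is_some() {
        // Step 1: parse a value.
        parse_digit(chars)?;

        // Step 2: parse a comma or a closing bracket.
        match chars.next() {
            Some(',') => (),              // another value must follow
            Some(']') => return Ok(true), // the array is closed
            _ => {
                return Err(
                    "Invalid JSON: expected comma or closing bracket after value".to_string(),
                )
            }
        }
    }

    // Input ran out right after a comma (or there was no input at all).
    Err("Invalid JSON: failed to parse array".to_string())
}

// Hypothetical convenience wrapper: true if `rest` is a valid array tail.
fn check(rest: &str) -> bool {
    parse_elements(&mut rest.chars().peekable()).is_ok()
}
```

With these stand-ins, check("1,2,3]") succeeds, while a trailing comma ("1,]"), a missing comma ("1 2]"), and a missing closing bracket ("1,2") are all rejected.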

And that's it! The full function looks like this:

fn parse_array(chars: &mut Peekable<Chars>) -> Result<bool, String> {
    // An array must start with an opening bracket.
    if let Some('[') = chars.peek() {
        chars.next();
    } else {
        return Err("Invalid JSON: Expected opening bracket".to_string());
    }

    // Empty array: nothing between the brackets.
    if let Some(']') = chars.peek() {
        chars.next();
        return Ok(true);
    }

    // Otherwise, alternate between a value and a separator.
    while let Some(_) = chars.peek() {
        if parse_value(chars).is_err() {
            return Err("Invalid JSON: failed to parse array value".to_string());
        }

        match chars.next() {
            Some(',') => (), // another value must follow
            Some(']') => return Ok(true), // end of the array
            _ => return Err("Invalid JSON: expected comma or closing bracket after value".to_string())
        }
    }

    // The input ended before the array was closed.
    return Err("Invalid JSON: unexpected end of array".to_string());
}
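
One thing to watch for: the sample file above has spaces and newlines around array elements. Whether parse_array accepts them depends on whether the parse_value and parse_object functions from the earlier parts consume surrounding whitespace, which this part does not show. If they do not, one possible approach is to skip whitespace inside parse_array, as in the sketch below. Everything here is an assumption made for illustration: skip_whitespace and is_valid_array are hypothetical helpers, and this parse_value is a stand-in that only understands nested arrays and unsigned integers.

```rust
use std::iter::Peekable;
use std::str::Chars;

// Hypothetical helper: consume the four whitespace characters JSON allows.
fn skip_whitespace(chars: &mut Peekable<Chars>) {
    while chars
        .next_if(|&c| matches!(c, ' ' | '\t' | '\n' | '\r'))
        .is_some()
    {}
}

// Stand-in for the series' parse_value (an assumption): handles only nested
// arrays and unsigned integers, enough to exercise parse_array.
fn parse_value(chars: &mut Peekable<Chars>) -> Result<bool, String> {
    match chars.peek().copied() {
        Some('[') => parse_array(chars),
        Some(c) if c.is_ascii_digit() => {
            // Consume the whole run of digits.
            while chars.next_if(|c| c.is_ascii_digit()).is_some() {}
            Ok(true)
        }
        _ => Err("Invalid JSON: failed to parse value".to_string()),
    }
}

// Whitespace-tolerant variant of parse_array.
fn parse_array(chars: &mut Peekable<Chars>) -> Result<bool, String> {
    // Consume the '[' only if it is actually there.
    if chars.next_if_eq(&'[').is_none() {
        return Err("Invalid JSON: expected opening bracket".to_string());
    }

    // Empty array, possibly with whitespace between the brackets.
    skip_whitespace(chars);
    if chars.next_if_eq(&']').is_some() {
        return Ok(true);
    }

    loop {
        // A value, optionally surrounded by whitespace. At end of input,
        // parse_value fails, so the loop always terminates.
        skip_whitespace(chars);
        parse_value(chars)?;
        skip_whitespace(chars);

        match chars.next() {
            Some(',') => (),              // another value must follow
            Some(']') => return Ok(true), // end of the array
            _ => {
                return Err(
                    "Invalid JSON: expected comma or closing bracket after value".to_string(),
                )
            }
        }
    }
}

// Hypothetical convenience wrapper for trying inputs.
fn is_valid_array(input: &str) -> bool {
    parse_array(&mut input.chars().peekable()).is_ok()
}
```

With this variant, is_valid_array accepts inputs such as "[1, 2, 3]", "[ ]", and arrays of arrays spread over several lines, and still rejects a trailing comma like "[1, ]". Like the original function, it only checks the array itself and does not verify that nothing follows the closing bracket.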

You should now have a working JSON parser. You can test it with the steps outlined in the previous parts. Note that this parser is not comprehensive and may not handle cases such as fractions, exponents, or escaped characters (including \u escapes written in hex). But it should be a good starting point for implementing those features yourself.

You can find the full source code for this project here.