Nicholas Tindle authored
- Create Token class to represent individual tokens in the markup language
- Create tokenize function to split the markup into a list of tokens
- Implement Parser class to parse the tokens and generate a dictionary or list
- Add support for nested tags and multiple root tags
- Handle whitespace and escaped characters correctly
- Write unit tests to verify the functionality of the tokenizer and parser

This commit implements the tokenizer and parser for an HTML-like markup language. It includes the following changes:

- Added token.py and parser.py to the gravitasml package
- Implemented the Token class with attributes for type, value, line_num, and column
- Implemented the tokenize function to split the markup into a list of tokens based on regular expressions
- Implemented the Parser class with methods for parsing the tokens and generating a dictionary or list
- Added support for nested tags and multiple root tags in the parsed output
- Handled whitespace and escaped characters correctly in the tokenization process
- Created unit tests for both the tokenizer and parser to verify their functionality

This implementation allows for the parsing of HTML-like markup into a structured dictionary or list, providing the foundation for further processing and manipulation of the markup data.
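
The flow described above (regex-based tokenization, then parsing into a dictionary or list) can be sketched roughly as follows. This is a minimal illustration and not the code in token.py or parser.py: the token type names (`TAG_OPEN`, `TAG_CLOSE`, `TEXT`), the regular expression, the backslash escape rule, and the shape of the output are all assumptions made for the example. Only the names `Token`, `tokenize`, `Parser` and the four `Token` attributes come from the commit message.

```python
import re
from dataclasses import dataclass


@dataclass
class Token:
    type: str      # assumed type names: "TAG_OPEN", "TAG_CLOSE", "TEXT"
    value: str
    line_num: int  # 1-based line where the token starts
    column: int    # 1-based column where the token starts


# Hypothetical patterns; the real package may tokenize differently.
TOKEN_RE = re.compile(
    r"<(?P<TAG_OPEN>[A-Za-z_]\w*)>"
    r"|</(?P<TAG_CLOSE>[A-Za-z_]\w*)>"
    r"|(?P<TEXT>(?:\\.|[^<\\])+)"  # plain text; a backslash escapes the next char
)


def tokenize(markup):
    tokens = []
    pos, line_num, line_start = 0, 1, 0
    while pos < len(markup):
        match = TOKEN_RE.match(markup, pos)
        column = pos - line_start + 1
        if match is None:
            raise SyntaxError(f"Unexpected input at line {line_num}, column {column}")
        kind = match.lastgroup
        value = match.group(kind)
        if kind == "TEXT":
            # Drop the escaping backslashes, then trim surrounding whitespace.
            value = re.sub(r"\\(.)", r"\1", value).strip()
        if kind != "TEXT" or value:  # skip whitespace-only text between tags
            tokens.append(Token(kind, value, line_num, column))
        raw = match.group(0)
        if "\n" in raw:  # keep line and column tracking in sync
            line_num += raw.count("\n")
            line_start = match.start() + raw.rfind("\n") + 1
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def parse(self):
        # One root yields a dict; several roots yield a list of dicts.
        roots = []
        while self.pos < len(self.tokens):
            roots.append(self._parse_element())
        return roots[0] if len(roots) == 1 else roots

    def _parse_element(self):
        open_tok = self.tokens[self.pos]
        if open_tok.type != "TAG_OPEN":
            raise SyntaxError(f"Expected an opening tag at line {open_tok.line_num}")
        self.pos += 1
        children, text = {}, None
        while self.pos < len(self.tokens):
            tok = self.tokens[self.pos]
            if tok.type == "TAG_CLOSE":
                if tok.value != open_tok.value:
                    raise SyntaxError(f"Mismatched closing tag </{tok.value}>")
                self.pos += 1
                # Simplification: child tags take precedence over text.
                return {open_tok.value: children if children else text}
            if tok.type == "TEXT":
                text = tok.value
                self.pos += 1
            else:
                # Simplification: a repeated sibling tag overwrites the earlier one.
                children.update(self._parse_element())
        raise SyntaxError(f"Unclosed tag <{open_tok.value}>")


if __name__ == "__main__":
    print(Parser(tokenize("<a><b>hi</b><c>x \\< y</c></a>")).parse())
    # {'a': {'b': 'hi', 'c': 'x < y'}}
```

In this sketch a single root tag produces a dictionary and multiple root tags produce a list of dictionaries, which is one plausible reading of "generate a dictionary or list"; the actual package may resolve that choice differently.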