public final class Tokenizer extends Object
These methods are useful both for quick-and-dirty testing of a lexical analyzer, e.g. manually typing test input sequences and checking that the analyzer behaves as expected, and for tokenizing complete files and recording the precise results of the lexical analysis, which can serve as a non-regression tool in the continuous integration of a project.
In particular, the input positions associated with tokens can be retrieved as well; they are usually important for error reporting and are also used by parsers built on top of the lexical analyzers. Positions are typically hard to check and debug thoroughly, in particular because subtle changes to a lexer description can break locations without otherwise affecting the behaviour of the analyzer.
tokenize(LexerInterface, String, Reader, Writer, boolean), prompt(LexerInterface, boolean), file(LexerInterface, File, File, boolean)
| Modifier and Type | Class and Description |
|---|---|
| `static interface` | `Tokenizer.LexerInterface<L extends LexBuffer,T>` This interface acts as a generic proxy to using a Dolmen-generated lexer in the static debugging functions provided in `Tokenizer`. |
| Modifier and Type | Method and Description |
|---|---|
| `static <L extends LexBuffer,T> void` | `file(Tokenizer.LexerInterface<L,T> lexer, File input, File output, boolean positions)` Uses the given lexer interface to tokenize the contents of the file `input`, and stores the result in the `output` file. |
| `static <L extends LexBuffer,T> void` | `prompt(Tokenizer.LexerInterface<L,T> lexer, boolean positions)` This method can be used to conveniently test a lexical analyzer against various one-line sentences entered manually or fed from a test file. |
| `static <L extends LexBuffer,T> void` | `tokenize(Tokenizer.LexerInterface<L,T> lexer, String inputName, Reader reader, Writer writer, boolean positions)` Initializes a lexical analyzer with the given input stream, based on the lexer interface, and repeatedly consumes tokens from the input until the halting condition in `lexer` is met. |
public static <L extends LexBuffer,T> void tokenize(Tokenizer.LexerInterface<L,T> lexer, String inputName, Reader reader, Writer writer, boolean positions)
Initializes a lexical analyzer with the given input stream, based on the `lexer` interface, and repeatedly consumes tokens from the input until the halting condition in `lexer` is met. The tokens are displayed, one per line, using the given `writer`. Optionally, the start and end positions of each token can be displayed along with the token.
Potential lexical and IO errors are caught and displayed, and abort the tokenization process. This method does not attempt to close the given reader/writer streams; closing them should be handled by the caller as necessary.
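The loop this method performs can be sketched as follows. The real method is driven by a Dolmen-generated lexer through `Tokenizer.LexerInterface`, whose exact operations are not shown in this documentation; the `SimpleLexer` stand-in below is therefore hypothetical, and only the shape of the loop (consume until the halting condition, one token per line, errors abort, streams left open) reflects the documented behaviour.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

// Sketch of the documented tokenization loop, with a hypothetical
// SimpleLexer stand-in for Tokenizer.LexerInterface.
class TokenizeSketch {

    interface SimpleLexer {
        String nextToken() throws IOException; // hypothetical entry point
        boolean isHalting(String token);       // hypothetical halting test
    }

    // Consume tokens until the halting condition is met, writing one
    // token per line. Errors abort the loop; the writer is flushed but,
    // as documented, never closed here.
    static void tokenize(SimpleLexer lexer, Writer writer) {
        try {
            while (true) {
                String tok = lexer.nextToken();
                if (lexer.isHalting(tok)) break;
                writer.write(tok);
                writer.write('\n');
            }
            writer.flush();
        } catch (IOException e) {
            System.err.println("Tokenization aborted: " + e.getMessage());
        }
    }

    // Toy whitespace lexer over a fixed input, for illustration only
    static String demo() {
        String[] words = "let x = 1".split("\\s+");
        int[] next = {0};
        SimpleLexer lexer = new SimpleLexer() {
            public String nextToken() {
                return next[0] < words.length ? words[next[0]++] : "EOF";
            }
            public boolean isHalting(String token) {
                return token.equals("EOF");
            }
        };
        StringWriter out = new StringWriter();
        tokenize(lexer, out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(demo()); // let, x, =, 1 on separate lines
    }
}
```

Because the loop never closes `writer`, callers can keep writing to the same stream after tokenization, which is why the file-based variant below manages its own streams.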
Parameters:
`lexer` - an interface to the lexical analyzer to use
`inputName` - a user-friendly name describing the input
`reader` - character stream to feed the lexer with
`writer` - character stream to write the tokens to
`positions` - whether token locations are displayed as well

public static <L extends LexBuffer,T> void prompt(Tokenizer.LexerInterface<L,T> lexer, boolean positions)
This method can be used to conveniently test a lexical analyzer against various one-line sentences entered manually or fed from a test file. Each line read from standard input is tokenized using the given `lexer`, as described by tokenize(LexerInterface, LexBuffer, Writer, boolean).
In response, the tokens are displayed on standard output, one per line. Optionally, the start and end positions of each token can be displayed along with the token.
Potential lexical and IO errors are caught and displayed, and handling of the subsequent lines on standard input resumes normally. The method stops when encountering end-of-input or a completely empty line. Of course, this method is not suitable for testing sentences which themselves contain line breaks.
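The prompt loop described above can be sketched as follows. The real method delegates each line to a Dolmen-generated lexer; the whitespace tokenizer below is purely illustrative, and the `run`/`demo` names are this sketch's own. What the sketch does take from the documentation is the control flow: stop at end-of-input or a completely empty line, and recover from an error on one line so subsequent lines are still handled.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Sketch of the prompt loop: read one line at a time, stop at
// end-of-input or a completely empty line, tokenize each line
// independently so an error on one line does not end the session.
class PromptSketch {

    // Illustrative stand-in for tokenizing a single line
    static List<String> tokenizeLine(String line) {
        List<String> tokens = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) tokens.add(word);
        }
        return tokens;
    }

    // Prompt loop over an arbitrary reader (stand-in for standard input)
    static List<String> run(BufferedReader in) throws IOException {
        List<String> displayed = new ArrayList<>();
        String line;
        while ((line = in.readLine()) != null && !line.isEmpty()) {
            try {
                displayed.addAll(tokenizeLine(line));
            } catch (RuntimeException e) {
                // Errors are displayed; subsequent lines are still handled
                System.err.println("error: " + e.getMessage());
            }
        }
        return displayed;
    }

    static String demo() {
        try {
            BufferedReader in = new BufferedReader(new StringReader(
                "let x = 1\ny + 2\n\nignored: after the empty line"));
            return run(in).toString();
        } catch (IOException e) {
            return "";
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // the empty line ends the session
    }
}
```

Note how everything after the empty line is never read, which matches the documented stopping condition.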
Parameters:
`lexer` - an interface to the lexical analyzer to use
`positions` - whether token locations are displayed as well

public static <L extends LexBuffer,T> void file(Tokenizer.LexerInterface<L,T> lexer, File input, File output, boolean positions)
Uses the given `lexer` interface to tokenize the contents of the file `input`, and stores the result in the `output` file. The tokenization process repeatedly consumes tokens from the input until the halting condition in `lexer` is met. The tokens are displayed, one per line, in the output. Optionally, the start and end positions of each token can be displayed along with the token.
Potential lexical and IO errors are caught and displayed on standard output, and abort the tokenization process.
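A file-based wrapper of this kind can be sketched as below. Since the underlying tokenization loop does not close its streams, it is the wrapper's job to open and close them, which try-with-resources handles naturally. The whitespace tokenizer and the `Path`-based helper names are this sketch's own assumptions; the real method drives a Dolmen-generated lexer and takes `java.io.File` arguments.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of the file-based variant: the wrapper owns the streams and
// closes them; errors are reported on standard output and abort the run.
class FileSketch {

    // Stand-in for the tokenization loop: one token per output line
    static void tokenize(Reader reader, Writer writer) throws IOException {
        BufferedReader in = new BufferedReader(reader);
        String line;
        while ((line = in.readLine()) != null) {
            for (String tok : line.trim().split("\\s+")) {
                if (!tok.isEmpty()) writer.write(tok + "\n");
            }
        }
    }

    static void file(Path input, Path output) {
        try (Reader r = Files.newBufferedReader(input);
             Writer w = Files.newBufferedWriter(output)) {
            tokenize(r, w);
        } catch (IOException e) {
            // Errors are displayed on standard output and abort tokenization
            System.out.println("tokenization aborted: " + e.getMessage());
        }
    }

    static String demo() {
        try {
            Path in = Files.createTempFile("tok", ".txt");
            Path out = Files.createTempFile("tok", ".out");
            Files.write(in, "let x = 1".getBytes());
            file(in, out);
            return new String(Files.readAllBytes(out));
        } catch (IOException e) {
            return "";
        }
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```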
Parameters:
`lexer` - an interface to the lexical analyzer to use
`input` - the input file to tokenize
`output` - the file where the resulting tokens are stored
`positions` - whether token locations are displayed as well