You can read text token by token.
The StreamTokenizer class from the java.io package breaks a character-based stream into tokens.
Use StreamTokenizer when you need to distinguish tokens by type (for example, words versus numbers) and to skip comments.
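As a minimal sketch of the comment handling mentioned above, the following example turns on C-style and C++-style comment skipping and collects only the word tokens. The class name CommentSkippingSketch and the helper words() are hypothetical names used for illustration. It uses a StringReader as the data source.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class CommentSkippingSketch {
    // Hypothetical helper: returns the word tokens in text, ignoring comments.
    static List<String> words(String text) throws IOException {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.ordinaryChar('/');        // by default '/' alone starts a single-line comment
        st.slashSlashComments(true); // skip // ... to the end of the line
        st.slashStarComments(true);  // skip /* ... */
        List<String> result = new ArrayList<>();
        while (st.nextToken() != StreamTokenizer.TT_EOF) {
            if (st.ttype == StreamTokenizer.TT_WORD) {
                result.add(st.sval);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        String text = "alpha /* skipped */ beta // also skipped\ngamma";
        System.out.println(words(text)); // prints [alpha, beta, gamma]
    }
}
```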
The example program below uses a StringReader object as the data source.
You can use a FileReader object or any other Reader object as the data source instead.
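Because the StreamTokenizer constructor accepts any Reader, the tokenizing logic can be written once and reused with different sources. The sketch below assumes a hypothetical helper, countWords(), and a temporary file created only for the demonstration. It is one possible arrangement, not the only one.

```java
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class ReaderSourceSketch {
    // Hypothetical helper: counts the TT_WORD tokens read from any Reader.
    static int countWords(Reader source) throws IOException {
        StreamTokenizer st = new StreamTokenizer(source);
        int words = 0;
        while (st.nextToken() != StreamTokenizer.TT_EOF) {
            if (st.ttype == StreamTokenizer.TT_WORD) {
                words++;
            }
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        // In-memory source
        System.out.println(countWords(new StringReader("one two 3 four"))); // prints 3

        // File source: a temporary file stands in for a real input file
        Path tmp = Files.createTempFile("tokens", ".txt");
        try {
            Files.write(tmp, "five six 7".getBytes(StandardCharsets.UTF_8));
            try (Reader file = new FileReader(tmp.toFile())) {
                System.out.println(countWords(file)); // prints 2
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```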
The program calls the nextToken() method of StreamTokenizer repeatedly until it returns TT_EOF.
Each call populates three fields of the StreamTokenizer object: ttype, sval, and nval.
The ttype field indicates the token type that was read.
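The following are the four predefined constants for the ttype field; for a single-character token such as a punctuation mark, ttype instead holds that character's value: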
Value | Description |
---|---|
TT_EOF | End of the stream has been reached. |
TT_EOL | End of line has been reached. Returned only after eolIsSignificant(true) has been called. |
TT_WORD | A word (a string) has been read as a token from the stream. |
TT_NUMBER | A number has been read as a token from the stream. |
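
To see the token types side by side, the following sketch labels each token it reads. It assumes a hypothetical helper named describe() and calls eolIsSignificant(true) so that line ends are reported as TT_EOL rather than skipped as whitespace. Any ttype value that is not one of the constants is treated as an ordinary single character.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class TokenTypeSketch {
    // Hypothetical helper: returns one label per token found in text.
    static List<String> describe(String text) throws IOException {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.eolIsSignificant(true); // without this call, TT_EOL is never returned
        List<String> labels = new ArrayList<>();
        while (st.nextToken() != StreamTokenizer.TT_EOF) {
            switch (st.ttype) {
                case StreamTokenizer.TT_WORD:
                    labels.add("WORD " + st.sval);
                    break;
                case StreamTokenizer.TT_NUMBER:
                    labels.add("NUMBER " + st.nval);
                    break;
                case StreamTokenizer.TT_EOL:
                    labels.add("EOL");
                    break;
                default:
                    // An ordinary character: ttype holds the character itself
                    labels.add("CHAR " + (char) st.ttype);
                    break;
            }
        }
        return labels;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(describe("total = 42\nend"));
        // prints [WORD total, CHAR =, NUMBER 42.0, EOL, WORD end]
    }
}
```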
If ttype is TT_WORD, the string value is stored in the sval field.
If ttype is TT_NUMBER, the numeric value is stored in the nval field as a double.
    import static java.io.StreamTokenizer.TT_EOF;
    import static java.io.StreamTokenizer.TT_NUMBER;
    import static java.io.StreamTokenizer.TT_WORD;

    import java.io.IOException;
    import java.io.StreamTokenizer;
    import java.io.StringReader;

    public class Main {
      public static void main(String[] args) throws Exception {
        String str = "This is a test from book 2s.com, `123$%%.89 which is simple 50. 9697 &(&*7";
        StringReader sr = new StringReader(str);
        StreamTokenizer st = new StreamTokenizer(sr);

        try {
          while (st.nextToken() != TT_EOF) {
            switch (st.ttype) {
            case TT_WORD: /* a word has been read */
              System.out.println("String value: " + st.sval);
              break;
            case TT_NUMBER: /* a number has been read */
              System.out.println("Number value: " + st.nval);
              break;
            }
          }
        } catch (IOException e) {
          e.printStackTrace();
        }
      }
    }