My program reads a text file with Scanner and stores every word in an ArrayList, one word at a time, using Scanner.next(). Any word that contains a non-alphabetic character should be ignored, meaning it should not be counted as a word at all (I don't want to strip or replace the offending characters). For example, "U2", "data-based", and "hello!" should not be counted.
I can read all the words and store them in the ArrayList, but I am stuck on ignoring words that contain non-letter characters.
This is part of my code:
public static void main(String[] args) {
    ArrayList<Word> wordList = new ArrayList<Word>();
    int wordCount = 0;
    Scanner input;
    try {
        System.out.println("Enter the file name with extension: ");
        input = new Scanner(System.in);
        File file = new File(input.nextLine());
        input.close();
        input = new Scanner(file); // throws FileNotFoundException if the file is missing
        while (input.hasNext()) {
            Word w = new Word(input.next().toLowerCase()); // should be case-insensitive
            if (!wordList.contains(w)) { // equals method overridden in Word class
                wordList.add(w);
            } else {
                wordList.get(wordList.indexOf(w)).addCount();
            }
            wordCount++;
        }
        input.close();
    } catch (FileNotFoundException e) {
        System.out.println("File not found: " + e.getMessage());
    }
}
Word is a simple class I wrote myself, with two attributes: word (String) and count (int). It overrides equals().
I think a regex would be the solution, but I have no regex experience and don't know how to express "non-alphabetic" in one, so I can't define a reliable character range.
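For anyone who wants to run the snippet, here is a minimal sketch of what such a Word class could look like. It is an assumption, not the original class: the getters, the starting count of 1, and the hashCode() override are illustrative choices.

```java
// Hypothetical reconstruction of the Word class described above.
class Word {
    private final String word;
    private int count = 1; // assumption: a word is counted once when first created

    Word(String word) {
        this.word = word;
    }

    void addCount() {
        count++;
    }

    String getWord() {
        return word;
    }

    int getCount() {
        return count;
    }

    // Two Word objects are equal when they wrap the same string,
    // which is what ArrayList.contains() and indexOf() rely on.
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof Word)) {
            return false;
        }
        return word.equals(((Word) o).word);
    }

    // Kept consistent with equals() so the class also works in hash-based collections.
    @Override
    public int hashCode() {
        return word.hashCode();
    }
}
```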
Any help would be appreciated!
You can use the regex ^[a-zA-Z]*$ to match strings made up only of letters. Apply this check before adding the word to your ArrayList.
The matches() method of the String class tells you whether a string consists only of letters. For example:
String str = "asd";
if (str.matches("^[a-zA-Z]*$")) { // the regex must be passed as a String literal
    // only letters
} else {
    // something else
}
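Applied to the reading loop from the question, the check could look roughly like the sketch below. The class and method names (LetterOnlyWords, isWord, readWords) are made up for illustration, and a plain list of strings stands in for the asker's Word class. Two small deviations from the regex above: matches() already requires the whole string to match, so the ^ and $ anchors are optional, and + is used instead of * so that an empty string is rejected too. The pattern only accepts ASCII letters; if accented or non-Latin letters should count as well, \p{L}+ may be a better fit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

class LetterOnlyWords {
    // Compiled once so the regex is not recompiled for every token.
    private static final Pattern LETTERS_ONLY = Pattern.compile("[a-zA-Z]+");

    // True only if the whole token consists of ASCII letters.
    static boolean isWord(String token) {
        return LETTERS_ONLY.matcher(token).matches();
    }

    // Reads whitespace-separated tokens and keeps the letters-only ones, lowercased.
    static List<String> readWords(Scanner input) {
        List<String> words = new ArrayList<>();
        while (input.hasNext()) {
            String token = input.next();
            if (!isWord(token)) {
                continue; // "U2", "data-based" and "hello!" are skipped entirely
            }
            words.add(token.toLowerCase());
        }
        return words;
    }

    public static void main(String[] args) {
        // A Scanner over a String stands in for the file here.
        Scanner input = new Scanner("Hello U2 data-based hello! world");
        System.out.println(readWords(input)); // prints [hello, world]
    }
}
```

In the original loop, the same isWord check would go right after input.next(), before the Word object is created and before wordCount is incremented, so rejected tokens never reach the list or the total.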