Wednesday, September 4, 2024

3 ways to Count words in Java String - Google Interview Questions with Solution

Today, I am going to share a Java interview question from Google that was asked to one of my readers during a telephonic round: how do you count the number of words in a given String in Java? The easy way is to use the split() method of String. A word is nothing but a sequence of non-space characters separated from the next word by one or more spaces, so splitting the String with a regular expression that matches whitespace gives you an array of all the words. However, if you are asked to write a program to count the number of words in a given String without using any String utility methods like String.split() or StringTokenizer, then it's a little bit challenging for a beginner programmer.

It's actually one of the common Java coding questions, and I have seen it a couple of times in interviews for Java developers with 2 to 4 years of experience, not just at Google but also at companies like Barclays, Nomura, Citibank, JP Morgan, and Amazon.

The interviewer also added constraints: split() was not allowed, and only basic methods like charAt(), length(), and substring() could be used, along with loops, operators, and other basic programming tools.

In this article, I'll share all three ways to solve this problem i.e. first by using String's split() method and regular expression, second by using StringTokenizer, and third without using any library method like above.

The third one is the most interesting, and it is quite difficult to write a complete solution that handles all special characters, e.g. non-printable ASCII characters.

For our purpose, we assume that whitespace includes the tab, space, and newline characters, and that a word is a run of characters that Character.isLetter() considers letters.
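To make that definition concrete, here is a small sketch of how the two standard JDK checks classify a few characters. The class name CharacterClassDemo is made up for illustration; only Character.isLetter() and Character.isWhitespace() come from the JDK.

```java
// Hypothetical demo class; shows how the JDK classifies sample characters
class CharacterClassDemo {

  public static void main(String[] args) {
    // Letters are what the hand-written solution treats as part of a word
    System.out.println(Character.isLetter('a'));       // true
    System.out.println(Character.isLetter('7'));       // false
    System.out.println(Character.isLetter(','));       // false
    System.out.println(Character.isLetter('+'));       // false

    // Space, tab, and newline all count as whitespace
    System.out.println(Character.isWhitespace(' '));   // true
    System.out.println(Character.isWhitespace('\t'));  // true
    System.out.println(Character.isWhitespace('\n'));  // true
  }
}
```

Under this definition, digits and punctuation are never part of a word, which matters for inputs like "C++" or "YouAre,best".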

Btw, if you are looking for more String-based coding problems, you can check out the Data Structures and Algorithms: Deep Dive Using Java course on Udemy. It not only provides a collection of common programming questions and solutions from tech giants like Amazon, Google, Facebook, and Microsoft but is also useful for refreshing your data structure and algorithm skills.



How to Count Number of Words in Given String in Java?

Without wasting any more of your time, here are my three solutions to this common coding problem. In the first method we will use the split() method of String class, in the second we will use StringTokenizer, and in the third, we'll solve this problem without using any API or library method.

Solution 1 - Counting words using String.split() method

In this solution, we will use the split() method of java.lang.String class to count the number of words in a given sentence. 

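This solution uses the regular expression "\\s+" to split the String on whitespace. The split() method returns an array, and the length of that array is the number of words in the given String. Note that leading whitespace would produce an empty first element, so the input should be trimmed before splitting.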

  public static int countWordsUsingSplit(String input) {
    if (input == null || input.trim().isEmpty()) {
      return 0;
    }

    // trim first, otherwise leading whitespace yields an empty first element
    String[] words = input.trim().split("\\s+");
    return words.length;
  }

If you are new to regular expressions in Java, \s is a predefined character class that matches a whitespace character, including tabs and newlines. Since \ needs to be escaped in a Java string literal, it becomes \\s, and because there could be multiple spaces between words, we add the + quantifier, which means one or more. Hence \\s+ matches a run of one or more whitespace characters, and the String is split on each such run.
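The difference between \\s and \\s+ is easiest to see on a few inputs. The following is a minimal sketch, not part of the original program: SplitWhitespaceDemo and countWords are hypothetical names, and countWords trims the input before splitting.

```java
import java.util.Arrays;

// Hypothetical demo class; compares "\\s" and "\\s+" on sample inputs
class SplitWhitespaceDemo {

  // Hypothetical helper: trims the input, then splits on runs of whitespace
  static int countWords(String input) {
    if (input == null || input.trim().isEmpty()) {
      return 0;
    }
    return input.trim().split("\\s+").length;
  }

  public static void main(String[] args) {
    // A single "\\s" splits on every whitespace character, so two spaces
    // in a row produce an empty token in between
    System.out.println(Arrays.toString("a  b".split("\\s")));   // [a, , b]

    // "\\s+" treats the whole run of whitespace as one separator
    System.out.println(Arrays.toString("a  b".split("\\s+")));  // [a, b]

    // Leading whitespace still yields an empty first token
    System.out.println(Arrays.toString(" a b".split("\\s+")));  // [, a, b]

    // Trimming first avoids that extra token
    System.out.println(countWords(" a \t b\nc "));              // 3
  }
}
```

This is why a plain "\\s" over-counts as soon as two spaces appear between words.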

You can also see The Complete Java MasterClass course by Tim Buchalka on Udemy to learn more about the split() method of the String class and regular expressions in Java. Using split() is also the simplest way to count the number of words in a given sentence.




Solution 2 - Counting words in String using StringTokenizer

In this solution, we use the java.util.StringTokenizer class. According to its Javadoc, the single-argument constructor constructs a string tokenizer for the specified string. The tokenizer uses the default delimiter set, which is " \t\n\r\f": the space character, the tab character, the newline character, the carriage-return character, and the form-feed character. Delimiter characters themselves will not be treated as tokens.


  public static int countWordsUsingStringTokenizer(String sentence) {
    if (sentence == null || sentence.isEmpty()) {
      return 0;
    }
    StringTokenizer tokens = new StringTokenizer(sentence);
    return tokens.countTokens();
  }

You can see that we have not given any explicit delimiter to StringTokenizer. It uses the default set of delimiters, which is enough to find any whitespace, and since words are separated by whitespace, the number of tokens is equal to the number of words in the given String.
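If punctuation should also separate words, StringTokenizer has a two-argument constructor that takes the delimiter characters explicitly. The sketch below is only an illustration: TokenizerDelimiterDemo and countTokens are hypothetical names, and adding the comma to the delimiter set is an assumption about what you want to count.

```java
import java.util.StringTokenizer;

// Hypothetical demo class; shows StringTokenizer with an explicit delimiter set
class TokenizerDelimiterDemo {

  // Hypothetical helper: counts tokens using a caller-supplied delimiter set
  static int countTokens(String sentence, String delimiters) {
    if (sentence == null) {
      return 0;
    }
    return new StringTokenizer(sentence, delimiters).countTokens();
  }

  public static void main(String[] args) {
    String whitespace = " \t\n\r\f";  // same characters as the default set

    // Only whitespace separates tokens, so the comma stays inside the token
    System.out.println(countTokens("YouAre,best", whitespace));        // 1

    // With the comma added as a delimiter, the input has two tokens
    System.out.println(countTokens("YouAre,best", whitespace + ","));  // 2

    // Runs of delimiters never produce empty tokens
    System.out.println(countTokens("  a   b  ", whitespace));          // 2
  }
}
```

Unlike split(), StringTokenizer skips leading and repeated delimiters on its own, so no trimming is needed.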

See Java Fundamentals Part 1 and 2 on Pluralsight for more information on the StringTokenizer class in Java. You need a membership, but even if you don't have one, you can use their 10-day free pass to access this course for free.



Solution 3 - Counting words in String without using library method

Here is the code to count the number of words in a given String without using any library or utility method. This is what you may have written in C or C++. It iterates through the character array of the String and checks every character.

It assumes that a word starts with a letter and ends with something which is not a letter. Once it encounters a non-letter it increments the counter and starts searching again from the next position.

 public static int count(String word) {
    if (word == null || word.isEmpty()) {
      return 0;
    }

    int wordCount = 0;

    boolean isWord = false;
    int endOfLine = word.length() - 1;
    char[] characters = word.toCharArray();

    for (int i = 0; i < characters.length; i++) {

      // if the char is a letter, word = true.
      if (Character.isLetter(characters[i]) && i != endOfLine) {
        isWord = true;

        // if char isn't a letter and there have been letters before,
        // counter goes up.
      } else if (!Character.isLetter(characters[i]) && isWord) {
        wordCount++;
        isWord = false;

        // last word of String; if it doesn't end with a non letter, it
        // wouldn't count without this.
      } else if (Character.isLetter(characters[i]) && i == endOfLine) {
        wordCount++;
      }
    }

    return wordCount;
  }
This code is complete and we will see the method live in action in the next paragraph where we'll combine all three solutions in a single Java program and test. 
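
The same letter-based idea can also be written by counting where a word starts instead of where it ends, which removes the special case for the last character. This is an alternative sketch, not the method used in the program below. WordStartCounter is a hypothetical name, and apart from Character.isLetter() it uses only charAt() and length(), in line with the interview constraint.

```java
// Hypothetical alternative: counts runs of letters by detecting where each begins
class WordStartCounter {

  static int count(String text) {
    if (text == null) {
      return 0;
    }
    int wordCount = 0;
    boolean inWord = false;
    for (int i = 0; i < text.length(); i++) {
      if (Character.isLetter(text.charAt(i))) {
        if (!inWord) {
          wordCount++;    // first letter of a new word
          inWord = true;
        }
      } else {
        inWord = false;   // any non-letter ends the current word
      }
    }
    return wordCount;
  }

  public static void main(String[] args) {
    System.out.println(count("Java and C++"));  // 3
    System.out.println(count("YouAre,best"));   // 2
    System.out.println(count("O"));             // 1
    System.out.println(count(""));              // 0
  }
}
```

Because a word is counted as soon as its first letter is seen, it does not matter whether the String ends with a letter or not.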


Java Program to count a number of words in String

Here is our complete Java program to count the number of words in a given String. It demonstrates all three approaches we have seen so far: using the String.split() method, using StringTokenizer, and writing your own method to count the number of words without using any third-party library, e.g. Google Guava or Apache Commons.


import java.util.StringTokenizer;

/*
 * Java Program to count number of words in String.
 * This program solves the problem in three ways,
 * by using String.split(), StringTokenizer, and
 * without any of them by just writing own logic
 */
public class Main {

  public static void main(String[] args) {

    String[] testdata = { "", null, "One", "O", "Java and C++", "a b c",
        "YouAre,best" };

    for (String input : testdata) {
      System.out.printf(
          "Number of words in string '%s' using split() is : %d %n", input,
          countWordsUsingSplit(input));
      System.out.printf(
          "Number of words in string '%s' using StringTokenizer is : %d %n",
          input, countWordsUsingStringTokenizer(input));
      System.out.printf("Number of words in string '%s' is : %d %n", input,
          count(input));
    }

  }

  /**
   * Count number of words in given String using split() and regular expression
   * 
   * @param input
   * @return number of words
   */
  public static int countWordsUsingSplit(String input) {
    if (input == null || input.trim().isEmpty()) {
      return 0;
    }

    // trim first, otherwise leading whitespace yields an empty first element
    String[] words = input.trim().split("\\s+");
    return words.length;
  }

  /**
   * Count number of words in given String using StringTokenizer
   * 
   * @param sentence
   * @return count of words
   */
  public static int countWordsUsingStringTokenizer(String sentence) {
    if (sentence == null || sentence.isEmpty()) {
      return 0;
    }
    StringTokenizer tokens = new StringTokenizer(sentence);
    return tokens.countTokens();
  }

  /**
   * Count number of words in given String without split() or any other utility
   * method
   * 
   * @param word
   * @return number of words separated by space
   */
  public static int count(String word) {
    if (word == null || word.isEmpty()) {
      return 0;
    }

    int wordCount = 0;

    boolean isWord = false;
    int endOfLine = word.length() - 1;
    char[] characters = word.toCharArray();

    for (int i = 0; i < characters.length; i++) {

      // if the char is a letter, word = true.
      if (Character.isLetter(characters[i]) && i != endOfLine) {
        isWord = true;

        // if char isn't a letter and there have been letters before,
        // counter goes up.
      } else if (!Character.isLetter(characters[i]) && isWord) {
        wordCount++;
        isWord = false;

        // last word of String; if it doesn't end with a non letter, it
        // wouldn't count without this.
      } else if (Character.isLetter(characters[i]) && i == endOfLine) {
        wordCount++;
      }
    }

    return wordCount;
  }

}

Output
Number of words in string '' using split() is : 0 
Number of words in string '' using StringTokenizer is : 0 
Number of words in string '' is : 0 
Number of words in string 'null' using split() is : 0 
Number of words in string 'null' using StringTokenizer is : 0 
Number of words in string 'null' is : 0 
Number of words in string 'One' using split() is : 1 
Number of words in string 'One' using StringTokenizer is : 1 
Number of words in string 'One' is : 1 
Number of words in string 'O' using split() is : 1 
Number of words in string 'O' using StringTokenizer is : 1 
Number of words in string 'O' is : 1 
Number of words in string 'Java and C++' using split() is : 3 
Number of words in string 'Java and C++' using StringTokenizer is : 3 
Number of words in string 'Java and C++' is : 3 
Number of words in string 'a b c' using split() is : 3 
Number of words in string 'a b c' using StringTokenizer is : 3 
Number of words in string 'a b c' is : 3 
Number of words in string 'YouAre,best' using split() is : 1 
Number of words in string 'YouAre,best' using StringTokenizer is : 1 
Number of words in string 'YouAre,best' is : 2 

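You can see that our program is working fine: all three methods agree on every input except the last one. For 'YouAre,best', split() and StringTokenizer return 1 because they only split on whitespace, while count() returns 2 because it treats every non-letter character, including the comma, as the end of a word.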
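
If the hand-written method should agree with split() and StringTokenizer on inputs like 'YouAre,best', one option is to treat only whitespace as a separator. The following is a minimal sketch under that assumption. WhitespaceWordCounter is a hypothetical name, and Character.isWhitespace() agrees with the other two approaches for ordinary spaces, tabs, and newlines, though not necessarily for every Unicode character.

```java
// Hypothetical variant: a word is any run of non-whitespace characters
class WhitespaceWordCounter {

  static int count(String text) {
    if (text == null) {
      return 0;
    }
    int count = 0;
    int i = 0;
    int n = text.length();
    while (i < n) {
      // skip the whitespace before the next word
      while (i < n && Character.isWhitespace(text.charAt(i))) {
        i++;
      }
      if (i < n) {
        count++;
        // consume the word itself, punctuation included
        while (i < n && !Character.isWhitespace(text.charAt(i))) {
          i++;
        }
      }
    }
    return count;
  }

  public static void main(String[] args) {
    System.out.println(count("YouAre,best"));   // 1
    System.out.println(count("Java and C++"));  // 3
    System.out.println(count("  a   b  "));     // 2
    System.out.println(count("   "));           // 0
  }
}
```

Which definition is right depends on what the interviewer means by a word, so it is worth asking before you start coding.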

If you want to practice some more questions of this type, you can also check the Cracking the Coding Interview book, one of the biggest collections of programming questions and solutions from technical interviews. It also includes questions from service-based companies like Infosys, TCS, and Cognizant.




That's all about how to count the number of words in a Java String. I have shown you three ways to solve this problem: first by using the split() method and a regular expression, second by using the StringTokenizer class, and third by writing the logic yourself without split() or StringTokenizer.

Depending upon your need, you can use any of these methods. The interviewer usually asks you to do it the third way, so be ready for that. You can also check out the following resources and coding problems to gain more practice and confidence.


Other String based coding problems you may like to solve
  • How to reverse a String in place in Java? (solution)
  • How to find all permutations of a given String in Java? (solution)
  • How to check if a given number is prime or not? (solution)
  • 10 Free Courses to learn Data Structure and Algorithms (courses)
  • How to find the highest occurring word from a text file in Java? (solution)
  • 100+ Data Structure and Algorithms Problems (solved)
  • 10 Books to learn Data Structure and Algorithms (books)
  • How to check if two given Strings are Anagram in Java? (solution)
  • 101 Coding Problems and a few tips for Tech interviews (tips)
  • How to check if the given string is palindrome or not in Java? (solution)
  • How to remove duplicate characters from String in Java? (solution)
  • 10 Data Structure and Algorithms courses to crack coding interview (courses)
  • How to check if a String contains duplicate characters in Java? (solution)
  • How to find the highest occurring word from a given file in Java? (solution)
  • How to count vowels and consonants in a given String in Java? (solution)
  • 21 String coding Problems from Technical Interviews (questions)
  • How to reverse words in a given String in Java? (solution)

Thanks for reading this article so far. If you find this String-based Java coding problem from Google and my explanation useful then please share it with your friends and colleagues. If you have any questions or feedback then please drop a note.

P. S. - If you are preparing for a programming job interview, then you must prepare for all-important topics like data structures, Strings, and arrays. One course that can help you with this is Grokking the Coding Interview: Patterns for Coding Questions on Educative. It contains popular coding interview patterns that will help you solve most of the problems in your coding interviews.



22 comments:

  1. we can also use direct method str.length();

    1. I tried that, and that method lists all the String's length, not counting the words. For example, "I am a cat." would give str.length() of 11, instead of counting 4 words.

    2. make it counting space instead of letters

    3. Tried it, but if you input only space, result will be incorrect.

  2. I have a 1 line short code for the 1st approach of using split() method here-

    int wordLength=str.split("\\s").length;

    Here 'str' is The Provided string
    Just declare the variable wordLength to store the length

    1. class A{
      public static void main(String[] args)
      {
      String s="hello kapil";

      int wordLength=s.split("\\s").length;

      System.out.println(wordLength);
      }
      }




      i tried this code and work fine thank brother

  3. how we can count words which are in digits form in a sentence in java

    1. Do you have an example like input and expected output?

  4. i want to only three letters of my words. how?

    1. What do you mean? can you elaborate with an example?

  5. why did you -1 in "int endOfLine = word.length() - 1;" ?

    1. because array index starts at zero, so max will be length - 1. For example, if there are 6 characters in word then length will return 6 but last character would be on 5th index

  6. Arrays.stream(str.split("\\s+")).count()

  7. Big thnx for your lesson. I have one question.
    In the last row of sout the count() output is 2 and this is the only one difference from the output of two counting methods before. Why does it happened?

  9. Hello friend, The issue with the code of third method is that it is not handling the case where non-letter characters are part of the same word. In the given example 'YouAre,best', the method treats 'YouAre' and 'best' as two separate words because the comma ',' is considered a non-letter character.

    To fix this issue, you can modify the code to consider non-letter characters as part of the word.

    Here's an updated version of the code:

    public static int count(String word) {
    if (word == null || word.isEmpty()) {
    return 0;
    }

    int wordCount = 0;
    boolean isWord = false;
    int endOfLine = word.length() - 1;
    char[] characters = word.toCharArray();

    for (int i = 0; i < characters.length; i++) {
    // if the char is a letter, word = true.
    if (Character.isLetter(characters[i]) && i != endOfLine) {
    isWord = true;
    } else if (!Character.isLetter(characters[i]) && isWord) {
    wordCount++;
    isWord = false;
    } else if (Character.isLetter(characters[i]) && i == endOfLine) {
    wordCount++;
    }
    }

    return wordCount;
    }

    Now, this updated code should correctly count 'YouAre,best' as a single word and output 1. I hope this clarify your doubt, if you have any questions feel free to ask. Once again thanks for you comment and bringing this to my attention.

  10. Big thanks for your describing, I'm new in java and your detailing solution with examples help me to better understand this theme.

  11. Unfortunately, the updating code is the same code that compiles the same output in last row. I have copied the code to idea and got the sam result, the last row counts 2 words instead 1. I will try to understand by myself to edit the code as I want to get in the end. Or if you show edited code, I will really glad to see. Sorry if I disturb you with my questions.

    1. Hello Goodislav, my bad, try this code, it should return "1"
      public class MyClass {
      public static void main(String args[]) {


      System.out.println(countWords("YouAre,best"));
      }

      public static int countWords(String sentence) {
      if (sentence == null || sentence.isEmpty()) {
      return 0;
      }

      int wordCount = 0;
      boolean isWord = false;

      for (int i = 0; i < sentence.length(); i++) {
      char currentChar = sentence.charAt(i);

      if (!Character.isWhitespace(currentChar)) {
      // Any non-whitespace character is part of a word
      isWord = true;
      } else if (isWord) {
      // Whitespace after a word ends that word
      wordCount++;
      isWord = false;
      }
      }

      // Count the last word if the sentence does not end with whitespace
      if (isWord) {
      wordCount++;
      }

      return wordCount;
      }


      }

    2. Here we are explicitly checking for whitespace to avoid any ambiguity.

  12. Thank you very much, Javin Paul. Everything is working now in all the methods.

