thinkapjava 5.1.2 documentation

Strings and things

«  Iteration and loops   ::   Contents   ::   Mutable objects  »

Strings and things

Characters

In Java and other object-oriented languages, objects are collections of related data that come with a set of methods. These methods operate on the objects, performing computations and sometimes modifying the object’s data.

Strings are objects, so you might ask “What is the data contained in a String object?” and “What are the methods we can invoke on String objects?” The components of a String object are letters or, more generally, characters. Not all characters are letters; some are numbers, symbols, and other things. For simplicity I call them all letters. There are many methods, but I use only a few in this book. The rest are documented here.

The first method we will look at is charAt, which allows you to extract letters from a String. char is the variable type that can store individual characters (as opposed to strings of them).

chars work just like the other types we have seen:

char ltr = 'c';
if (ltr == 'c') {
  System.out.println(ltr);
}

Character values appear in single quotes, like ’c’. Unlike string values (which appear in double quotes), character values can contain only a single letter or symbol.

Here’s how the charAt method is used:

String fruit = "banana";
char letter = fruit.charAt(1);
System.out.println(letter);

fruit.charAt() means that I am invoking the charAt method on the object named fruit. I am passing the argument 1 to this method, which means that I want to know the first letter of the string. The result is a character, which is stored in a char named letter. When I print the value of letter, I get a surprise:

a

a is not the first letter of "banana". Unless you are a computer scientist. For technical reasons, computer scientists start counting from zero. The 0th letter (“zeroeth”) of "banana" is b. The 1th letter (“oneth”) is a and the 2th (“twooth”) letter is n.

If you want the zereoth letter of a string, you have to pass 0 as an argument:

char letter = fruit.charAt(0);

Length

The next String method we’ll look at is length, which returns the number of characters in the string. For example:

int length = fruit.length();

length takes no arguments and returns an integer, in this case 6. Notice that it is legal to have a variable with the same name as a method (although it can be confusing for human readers).

To find the last letter of a string, you might be tempted to try something like

int length = fruit.length();
char last = fruit.charAt(length);       // WRONG!!

That won’t work. The reason is that there is no 6th letter in "banana". Since we started counting at 0, the 6 letters are numbered from 0 to 5. To get the last character, you have to subtract 1 from length.

int length = fruit.length();
char last = fruit.charAt(length-1);

Traversal

A common thing to do with a string is start at the beginning, select each character in turn, do some computation with it, and continue until the end. This pattern of processing is called a traversal. A natural way to encode a traversal is with a while statement:

int index = 0;
while (index < fruit.length()) {
    char letter = fruit.charAt(index);
    System.out.println(letter);
    index = index + 1;
}

This loop traverses the string and prints each letter on a line by itself. Notice that the condition is index < fruit.length(), which means that when index is equal to the length of the string, the condition is false and the body of the loop is not executed. The last character we access is the one with the index fruit.length()-1.

The name of the loop variable is index. An index is a variable or value used to specify a member of an ordered set, in this case the string of characters. The index indicates (hence the name) which one you want.

Run-time errors

Way back in Section Run-time errors I talked about run-time errors, which are errors that don’t appear until a program has started running. In Java run-time errors are called exceptions.

You probably haven’t seen many run-time errors, because we haven’t been doing many things that can cause one. Well, now we are. If you use the charAt method and provide an index that is negative or greater than length-1, it throws an exception. You can think of “throwing” an exception like throwing a tantrum.

When that happens, Java prints an error message with the type of exception and a stack trace, which shows the methods that were running when the exception occurred. Here is an example:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
public class BadString {

    public static void main(String[] args) {
        processWord("banana");
    }

    public static void processWord(String s) {
        char c = getLastLetter(s);
        System.out.println(c);
    }

    public static char getLastLetter(String s) {
        int index = s.length();         // WRONG!
        char c = s.charAt(index);
        return c;
    }
}

Notice the error in getLastLetter: the index of the last character should be s.length()-1. Here’s what you get:

Exception in thread "main" java.lang.StringIndexOutOfBoundsException:
  String index out of range: 6
    at java.lang.String.charAt(String.java:658)
    at BadString.getLastLetter(BadStringjava:14)
    at BadString.processWord(BadStringjava:8)
    at BadString.main(BadStringjava:4)

Then the program ends. The stack trace can be hard to read, but it contains a lot of information. In particular, not the line numbers. The trace is telling you exactly where each function was executing when the exception occurred. You can see that main called processWord on line 4. Then processWord called getLastLetter on line 8. Then getLastLetter called charAt on line 14. Finally an exception was thrown on line 658 of the code for charAt.

Reading documentation

If you go to the API and click on charAt, you get the following documentation (or something like it):

public char charAt(int index)

Returns the char value at the specified index. An index ranges from 0 to length() - 1. The first char value of the sequence is at index 0, the next at index 1, and so on, as for array indexing.

Parameters: index - the index of the char value.

Returns: the char value at the specified index of this string.
The first char value is at index 0.
Throws: IndexOutOfBoundsException - if the index argument is
negative or not less than the length of this string.

The first line is the method’s prototype, which specifies the name of the method, the type of the parameters, and the return type.

The next line describes what the method does. The following lines explain the parameters and return values. In this case the explanations are redundant, but the documentation is supposed to fit a standard format. The last line describes the exceptions this method might throw.

It might take some time to get comfortable with this kind of documentation, but it is worth the effort.

The indexOf method

indexOf is the inverse of charAt: charAt takes an index and returns the character at that index; indexOf takes a character and finds the index where that character appears.

charAt fails if the index is out of range, and throws an exception. indexOf fails if the character does not appear in the string, and returns the value -1.

String fruit = "banana";
int index = fruit.indexOf('a');

This finds the index of the letter ’a’ in the string. In this case, the letter appears three times, so it is not obvious what indexOf should do. According to the documentation, it returns the index of the first appearance.

To find subsequent appearances, there is another version of indexOf. It takes a second argument that indicates where in the string to start looking. For an explanation of this kind of overloading, see Section Overloading.

If we invoke:

int index = fruit.indexOf('a', 2);

it starts at the twoeth letter (the first n) and finds the second a, which is at index 3. If the letter happens to appear at the starting index, the starting index is the answer. So

int index = fruit.indexOf('a', 5);

returns 5.

Looping and counting

The following program counts the number of times the letter ’a’ appears in a string:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
String fruit = "banana";
int length = fruit.length();
int count = 0;

int index = 0;
while (index < length) {
    if (fruit.charAt(index) == 'a') {
        count = count + 1;
    }
    index = index + 1;
}
System.out.println(count);

This program demonstrates a common idiom, called a counter. The variable count is initialized to zero and then incremented each time we find an ’a’. To increment is to increase by one; it is the opposite of decrement. When we exit the loop, count contains the result: the total number of a’s.

Increment and decrement operators

Incrementing and decrementing are such common operations that Java provides special operators for them. The ++ operator adds one to the current value of an int or char. –- subtracts one. Neither operator works on doubles, booleans or Strings.

Technically, it is legal to increment a variable and use it in an expression at the same time. For example, you might see something like:

System.out.println(i++);

Looking at this, it is not clear whether the increment will take effect before or after the value is printed. Because expressions like this tend to be confusing, I discourage you from using them. In fact, to discourage you even more, I’m not going to tell you what the result is. If you really want to know, you can try it.

Using the increment operators, we can rewrite the letter-counter:

1
2
3
4
5
6
7
int index = 0;
while (index < length) {
  if (fruit.charAt(index) == 'a') {
    count++;
  }
  index++;
}

It is a common error to write something like

index = index++;             // WRONG!!

Unfortunately, this is syntactically legal, so the compiler will not warn you. The effect of this statement is to leave the value of index unchanged. This is often a difficult bug to track down.

Remember, you can write index = index+1, or you can write index++, but you shouldn’t mix them.

Strings are immutable

As you read the documentation of the String methods, you might notice toUpperCase and toLowerCase. These methods are often a source of confusion, because it sounds like they have the effect of changing (or mutating) an existing string. Actually, neither these methods nor any others can change a string, because strings are immutable.

When you invoke toUpperCase on a String, you get a new String as a return value. For example:

String name = "Alan Turing";
String upperName = name.toUpperCase();

After the second line is executed, upperName contains the value "ALAN TURING", but name still contains "Alan Turing".

Strings are incomparable

It is often necessary to compare strings to see if they are the same, or to see which comes first in alphabetical order. It would be nice if we could use the comparison operators, like == and >, but we can’t.

To compare Strings, we have to use the equals and compareTo methods. For example:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
String name1 = "Alan Turing";
String name2 = "Ada Lovelace";

if (name1.equals (name2)) {
  System.out.println("The names are the same.");
}

int flag = name1.compareTo (name2);
if (flag == 0) {
  System.out.println("The names are the same.");
} else if (flag < 0) {
  System.out.println("name1 comes before name2.");
} else if (flag > 0) {
  System.out.println("name2 comes before name1.");
}

The syntax here is a little weird. To compare two Strings, you have to invoke a method on one of them and pass the other as an argument.

The return value from equals is straightforward enough; true if the strings contain the same characters, and false otherwise.

The return value from compareTo is a weird, too. It is the difference between the first characters in the strings that differ. If the strings are equal, it is 0. If the first string (the one on which the method is invoked) comes first in the alphabet, the difference is negative. Otherwise, the difference is positive. In this case the return value is positive 8, because the second letter of “Ada” comes before the second letter of “Alan” by 8 letters.

Just for completeness, I should admit that it is legal, but very seldom correct, to use the == operator with Strings. I explain why in Section The sameCard method; for now, don’t do it.

Glossary

object:
A collection of related data that comes with a set of methods that operate on it. The objects we have used so far are Strings, Bugs, Rocks, and the other GridWorld objects.
index:
A variable or value used to select one of the members of an ordered set, like a character from a string.
exception:
A run-time error.
throw:
Cause an exception.
stack trace:
A report that shows the state of a program when an exception occurs.
prototype:
The first line of a method, which specifies the name, parameters and return type.
traverse:
To iterate through all the elements of a set performing a similar operation on each.
counter:
A variable used to count something, usually initialized to zero and then incremented.
increment:
Increase the value of a variable by one. The increment operator in Java is ++.
decrement:
Decrease the value of a variable by one. The decrement operator in Java is .

Exercises

Exercise

Write a method that takes a String as an argument and that prints the letters backwards all on one line.

Exercise

Read the stack trace in Section Run-time errors and answer these questions:

  • What kind of Exception occurred, and what package is it defined in?
  • What is the value of the index that caused the exception?
  • What method threw the exception, and where is that method defined?
  • What method invoked charAt?
  • In BadString.java, what is the line number where charAt was invoked?

Exercise

Encapsulate the code in Section Looping and counting in a method named countLetters, and generalize it so that it accepts the string and the letter as arguments.

Then rewrite the method so that it uses indexOf to locate the a’s, rather than checking the characters one by one.

Exercise

The purpose of this exercise is to review encapsulation and generalization.

  1. Encapsulate the following code fragment, transforming it into a method that takes a String as an argument and that returns the final value of count.
  2. In a sentence or two, describe what the resulting method does (without getting into the details of how).
  3. Now that you have generalized the code so that it works on any String, what could you do to generalize it more?
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
String s = "((3 + 7) * 2)";
int len = s.length();

int i = 0;
int count = 0;

while (i < len) {
    char c = s.charAt(i);

    if (c == '(') {
       count = count + 1;
    } else if (c == ')') {
       count = count - 1;
    }
    i = i + 1;
}

System.out.println(count);

Exercise

The point of this exercise is to explore Java types and fill in some of the details that aren’t covered in the chapter.

  1. Create a new program named Test.java and write a main method that contains expressions that combine various types using the + operator. For example, what happens when you “add” a String and a char? Does it perform addition or concatenation? What is the type of the result? (How can you determine the type of the result?)

  2. Make a bigger copy of the following table and fill it in. At the intersection of each pair of types, you should indicate whether it is legal to use the + operator with these types, what operation is performed (addition or concatenation), and what the type of the result is.

      boolean char int String
    boolean        
    char        
    int        
    String        
  3. Think about some of the choices the designers of Java made when they filled in this table. How many of the entries seem unavoidable, as if there were no other choice? How many seem like arbitrary choices from several equally reasonable possibilities? How many seem problematic?

  4. Here’s a puzzler: normally, the statement x++ is exactly equivalent to x = x + 1. But if x is a char, it’s not! In that case, x++ is legal, but x = x + 1 causes an error. Try it out and see what the error message is, then see if you can figure out what is going on.

Exercise

What is the output of this program? Describe in a sentence what mystery does (not how it works).

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
public class Mystery {

    public static String mystery(String s) {
        int i = s.length() - 1;
        String total = "";

        while (i >= 0 ) {
            char ch = s.charAt(i);
            System.out.println(i + "     " + ch);

            total = total + ch;
            i--;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(mystery("Allen"));
    }
}

Exercise

A friend of yours shows you the following method and explains that if number is any two-digit number, the program will output the number backwards. He claims that if number is 17, the method will output 71.

Is he right? If not, explain what the program actually does and modify it so that it does the right thing.

int number = 17;
int lastDigit = number%10;
int firstDigit = number/10;
System.out.println(lastDigit + firstDigit);

Exercise

What is the output of the following program?

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
public class Enigma {

    public static void enigma(int x) {
        if (x == 0) {
            return;
        } else {
            enigma(x/2);
        }

        System.out.print(x%2);
    }

    public static void main(String[] args) {
        enigma(5);
        System.out.println("");
    }
}

Explain in 4-5 words what the method enigma really does.

Exercise

  1. Create a new program named Palindrome.java.

  2. Write a method named first that takes a String and returns the first letter, and one named last that returns the last letter.

  3. Write a method named middle that takes a String and returns a substring that contains everything except the first and last characters.

    Hint: read the documentation of the substring method in the String class. Run a few tests to make sure you understand how substring works before you try to write middle.

    What happens if you invoke middle on a string that has only two letters? One letter? No letters?

  4. The usual definition of a palindrome is a word that reads the same both forward and backward, like “otto” and “palindromeemordnilap.” An alternative way to define a property like this is to specify a way of testing for the property. For example, we might say, “a single letter is a palindrome, and a two-letter word is a palindrome if the letters are the same, and any other word is a palindrome if the first letter is the same as the last and the middle is a palindrome.”

    Write a recursive method named isPalindrome that takes a String and that returns a boolean indicating whether the word is a palindrome or not.

  5. Once you have a working palindrome checker, look for ways to simplify it by reducing the number of conditions you check. Hint: it might be useful to adopt the definition that the empty string is a palindrome.

  6. On a piece of paper, figure out a strategy for checking palindromes iteratively. There are several possible approaches, so make sure you have a solid plan before you start writing code.

  7. Implement your strategy in a method called isPalindromeIter.

  8. Optional: Appendix Input and Output in Java provides code for reading a list of words from a file. Read a list of words and print the palindromes.

Exercise

A word is said to be “abecedarian” if the letters in the word appear in alphabetical order. For example, the following are all 6-letter English abecedarian words.

abdest, acknow, acorsy, adempt, adipsy, agnosy, befist, behint, beknow, bijoux, biopsy, cestuy, chintz, deflux, dehors, dehort, deinos, diluvy, dimpsy
  1. Describe a process for checking whether a given word (String) is abecedarian, assuming that the word contains only lower-case letters. Your process can be iterative or recursive.
  2. Implement your process in a method called isAbecedarian.

Exercise

A dupledrome is a word that contains only double letters, like “llaammaa” or “ssaabb”. I conjecture that there are no dupledromes in common English use. To test that conjecture, I would like a program that reads words from the dictionary one at a time and checks them for dupledromity.

Write a method called isDupledrome that takes a String and returns a boolean indicating whether the word is a dupledrome.

Exercise

  1. The Captain Crunch decoder ring works by taking each letter in a string and adding 13 to it. For example, ’a’ becomes ’n’ and ’b’ becomes ’o’. The letters “wrap around” at the end, so ’z’ becomes ’m’.

    Write a method that takes a String and that returns a new String containing the encoded version. You should assume that the String contains upper and lower case letters, and spaces, but no other punctuation. Lower case letters should be tranformed into other lower case letters; upper into upper. You should not encode the spaces.

  2. Generalize the Captain Crunch method so that instead of adding 13 to the letters, it adds any given amount. Now you should be able to encode things by adding 13 and decode them by adding -13. Try it.

Exercise

If you did the GridWorld exercises in Chapter GridWorld: Part 1, you might enjoy this exercise. The goal is to use trigonometry to get the Bugs to chase each other.

Make a copy of BugRunner.java named ChaseRunner.java and import it into your development environment. Before you change anything, check that you can compile and run it.

  • Create two Bugs, one red and one blue.

  • Write a method called distance that takes two Bugs and computes the distance between them. Remember that you can get the x-coordinate of a Bug like this:

    int x = bug.getLocation().getCol();
    
  • Write a method called turnToward that takes two Bugs and turns one to face the other. HINT: use Math.atan2, but remember that the result is in radians, so you have to convert to degrees. Also, for Bugs, 0 degress is North, not East.

  • Write a method called moveToward that takes two Bugs, turns the first to face the second, and then moves the first one, if it can.

  • Write a method called moveBugs that takes two Bugs and an integer n, and moves each Bug toward the other n times. You can write this method recursively, or use a while loop.

  • Test each of your methods as you develop them. When they are all working, look for opportunities to improve them. For example, if you have redundant code in distance and turnToward, you could encapsulate the repeated code in a method.

«  Iteration and loops   ::   Contents   ::   Mutable objects  »