Spring 2012 CSCI 220 Week 5

= String Data Type =
Text is represented by the string data type. You can think of a string as a sequence of characters.

As you already know, you can specify strings using either double quotes or single quotes:
>>> str1 = "Hello"
>>> str2 = 'spam'
>>> print(str1, str2)
Hello spam
>>> type(str1)
<class 'str'>
>>> type(str2)
<class 'str'>

You can get string input from the user by using the input function on its own (i.e., do not evaluate the string):
>>> firstName = input("Please enter your name: ")
Please enter your name: Paul
>>> print("Hello", firstName)
Hello Paul

Access individual characters from the string
This is done by indexing into the string, that is, by referring to a character by its position (0, 1, 2, ...). Positions start at 0, so the last valid index is one less than the length of the string:
>>> str1 = "Hello Paul"
>>> str1[0]
'H'
>>> str1[8]
'u'
>>> str1[9]
'l'
>>> str1[10]
Traceback (most recent call last):
  ...
IndexError: string index out of range

The general form for indexing is <string>[<expr>]. It is common to combine string indexing and loops:
for i in range(10):
    print(str1[i], end=" ")

Output:
H e l l o   P a u l

Python also allows you to index strings from the right end of the string using negative indexes:
>>> str1[-1]
'l'
>>> str1[-2]
'u'

Indexing returns a single character from a string. It is also possible to access a contiguous sequence of characters, or substring. This is accomplished by an operation known as slicing. The general form of slicing is <string>[<start>:<end>]. Both start and end should be integer expressions. This operation returns a string starting at position start and running up to, but not including, position end. Example:
>>> str1[0:4]
'Hell'
>>> str1[0:5]
'Hello'

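You can also leave out the start and/or end parameter. A missing start defaults to the beginning of the string, and a missing end defaults to the end of the string:
>>> str1[6:]
'Paul'
>>> str1[4:]
'o Paul'
>>> str1[:4]
'Hell'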

How do I join two or more strings?
>>> "Paul" + "Edward" + "Anderson" 'PaulEdwardAnderson'

>>> str1 = "Paul"
>>> str2 = "Edward"
>>> str3 = "Anderson"
>>> str1 + str2 + str3
'PaulEdwardAnderson'

How do I repeat the same string?
>>> "Paul" * 3 'PaulPaulPaul'

How can I loop through a string?
>>> for ch in "Spam!": print(ch, end=" ")

S p a m !

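Or you can loop over the positions by combining range with the len function. Note that this loop prints the indexes rather than the characters:
>>> for i in range(len("Spam!")): print(i, end=" ")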

0 1 2 3 4

Write a program that reads a person's name and computes the corresponding username.

# username.py
#    Simple string processing program to generate usernames.

def main():
    print("This program generates computer usernames.\n")

    # get user's first and last names
    first = input("Please enter your first name (all lowercase): ")
    last = input("Please enter your last name (all lowercase): ")

    # concatenate first initial with 7 chars of the last name.
    uname = first[0] + last[:7]

    # output the username
    print("Your username is:", uname)

main()

Example execution:
This program generates computer usernames.

Please enter your first name (all lowercase): paul
Please enter your last name (all lowercase): anderson
Your username is: panderso

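What happens when the last name is shorter than 7 characters?

Answer: Nothing goes wrong. A slice that runs past the end of a string simply stops at the end, so last[:7] returns the whole last name and does not cause an error.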
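
As a small illustration of that answer, the sketch below contrasts slicing with indexing on a short name. The helper name make_username is hypothetical and introduced only for this example; it wraps the same expression used in username.py.

```python
def make_username(first, last):
    """Return the first initial plus up to 7 characters of the last name."""
    # Slicing never raises IndexError; it stops at the end of the string.
    return first[0] + last[:7]

print(make_username("paul", "anderson"))  # panderso
print(make_username("amy", "lee"))        # alee

# Indexing, by contrast, fails when the position does not exist.
try:
    "lee"[7]
except IndexError as err:
    print("IndexError:", err)
```

Only indexing needs a position that actually exists; a slice quietly returns whatever characters are available.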

Print the abbreviation of the month given a month number (note: use slice operator)

# month.py
#  A program to print the abbreviation of a month, given its number

def main():
    # months is used as a lookup table
    months = "JanFebMarAprMayJunJulAugSepOctNovDec"

    n = eval(input("Enter a month number (1-12): "))

    # compute starting position of month n in months
    pos = (n-1) * 3

    # Grab the appropriate slice from months
    monthAbbrev = months[pos:pos+3]

    # print the result
    print("The month abbreviation is", monthAbbrev + ".")

main()

Sample execution:
Enter a month number (1-12): 3
The month abbreviation is Mar.

A problem with this solution is that we are forced to use three-letter abbreviations, because every entry in the lookup string must have the same length. This limitation can be removed by using lists, another kind of sequence.

= Lists as Sequences =
We've already been using lists in some of our for loops:
>>> for i in [0,2,4]: print(i)

0
2
4

A list is a sequence of elements. You can use lists in much the same way as a sequence of characters:
>>> grades = ['A','B','C','D','F']
>>> grades[0]
'A'
>>> grades[2:4]
['C', 'D']
>>> len(grades)
5
>>> for grade in grades: print(grade)

A
B
C
D
F

You can mix it up and create lists that contain both numbers and strings:

myList = [1, "Spam", 4.0, "U"]

Now we can update our month.py program to print the full name of the month.
# month2.py
#  A program to print the full name of a month, given its number

def main():
    # months is a list used as a lookup table
    months = ["January", "February", "March", "April", "May", "June",
              "July", "August", "September", "October", "November", "December"]

    n = eval(input("Enter a month number (1-12): "))

    print("The month is", months[n-1] + ".")

main()

Example execution:
Enter a month number (1-12): 3
The month is March.

One thing to notice: I've spread the months list declaration across two lines. Python allows this because the line break falls inside the square brackets.

While strings and lists are both sequences, there is an important difference between the two. Lists are mutable: their elements can be changed in place. Strings are immutable: once created, their characters cannot be changed.
>>> myList = [34, 26, 15, 10]
>>> myList[2]
15
>>> myList[2] = 0
>>> myList
[34, 26, 0, 10]
>>> myString = "Hello World"
>>> myString[2]
'l'
>>> myString[2] = 'z'
Traceback (most recent call last):
  ...
TypeError: 'str' object does not support item assignment
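
Because strings are immutable, "changing" a character really means building a new string. The sketch below shows two common ways to do that; the variable names are chosen only for illustration.

```python
my_string = "Hello World"

# Strings cannot be modified in place, so assemble a new one from slices.
new_string = my_string[:2] + "z" + my_string[3:]
print(new_string)  # Hezlo World
print(my_string)   # Hello World (the original is unchanged)

# Alternative: convert to a list (mutable), edit it, then join it back.
chars = list(my_string)
chars[2] = "z"
rebuilt = "".join(chars)
print(rebuilt)     # Hezlo World
```

Both approaches leave the original string untouched and produce a separate string object.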

= String Representation and Message Encoding =

How do computers represent strings internally?
Each character is translated into a number, and the entire string is stored as a sequence of binary numbers in computer memory.

In the early days of computing, different designers and manufacturers used different encodings. You can imagine the headaches this caused.

To avoid this problem, computer systems today use industry standard encodings. One important standard is called ASCII (American Standard Code for Information Interchange). ASCII uses the numbers 0 through 127 to represent the typical characters found on an American keyboard. Computer systems are moving towards support of the Unicode standard, which is an international standard. Python, like most modern languages, supports Unicode.

To see what number is associated with a character in Python, you can ask for the ordinal (numeric value) of a character with ord, and convert a number back to its character with chr:
>>> ord('a')
97
>>> ord('A')
65
>>> chr(65)
'A'
>>> chr(97)
'a'

A single byte can store every ASCII code, since a byte has 2^8 == 256 possible values and ASCII needs only 128 of them. The Unicode standard, however, defines over 100,000 characters, so Unicode encodings use more than a single byte for some characters.
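
To make the byte counts concrete, here is a minimal sketch that encodes three characters and counts the resulting bytes. It assumes UTF-8, which is one of several Unicode encodings and is not named in these notes; the sample characters are arbitrary.

```python
# A, e with an acute accent, and the euro sign, written as escapes
samples = ["A", "\u00e9", "\u20ac"]

byte_counts = []
for ch in samples:
    encoded = ch.encode("utf-8")      # convert the character to bytes
    byte_counts.append(len(encoded))
    print(ord(ch), len(encoded))      # ordinal value, number of bytes

print(byte_counts)  # [1, 2, 3]
```

The ASCII character fits in one byte, while the other two need two and three bytes under this encoding.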

Exercise
Create a program that prints all of the characters of a string using their ordinal (ASCII) encoding.

Extra string functions
>>> myList = "paul edward anderson".split(" ")
>>> myList
['paul', 'edward', 'anderson']
>>> myList[0]
'paul'
>>> myString = " ".join(myList)
>>> len(myList)
3
>>> myString
'paul edward anderson'
>>> ",".join(myList)
'paul,edward,anderson'
>>> " and ".join(myList)
'paul and edward and anderson'
>>> myString.upper()
'PAUL EDWARD ANDERSON'
>>> myString.title()
'Paul Edward Anderson'
>>> myString.find("a")
1
>>> myString
'paul edward anderson'

Note that upper and title return new strings; myString itself is unchanged.
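
As one possible way to combine split, indexing, upper, and join, the sketch below builds the initials of a name. The function name initials is hypothetical and not part of the course material.

```python
def initials(full_name):
    """Return the upper-case initials of a space-separated name."""
    letters = []
    for part in full_name.split():       # split on whitespace
        letters.append(part[0].upper())  # first character of each word
    return ".".join(letters) + "."

print(initials("paul edward anderson"))  # P.E.A.
```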

Exercise: Date Conversion
Input a date such as "05/24/2003", and the program will display the date as "May 24, 2003".


# dateconvert.py
#  Converts a date in form "mm/dd/yyyy" to "month day, year"

def main():
    # get the date
    dateStr = input("Enter a date (mm/dd/yyyy): ")

    # split into components
    monthStr, dayStr, yearStr = dateStr.split("/")

    # convert monthStr to the month name
    months = ["January", "February", "March", "April", "May", "June",
              "July", "August", "September", "October", "November", "December"]
    monthName = months[int(monthStr)-1]

    # output result in month day, year format
    print("The converted date is:", monthName, dayStr+",", yearStr)

main()

Example execution:
Enter a date (mm/dd/yyyy): 03/11/2001
The converted date is: March 11, 2001

Get current date and time in Python
import datetime

now = datetime.datetime.now()

print()
print("Current date and time using str method of datetime object:")
print(str(now))

print()
print("Current date and time using instance attributes:")
print("Current year:", now.year)
print("Current month:", now.month)
print("Current day:", now.day)
print("Current hour:", now.hour)
print("Current minute:", now.minute)
print("Current second:", now.second)
print("Current microsecond:", now.microsecond)

print()
print("Current date and time using strftime:")
print(now.strftime("%Y-%m-%d %H:%M"))
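
Since the current time changes on every run, the strftime format codes are easier to study with a fixed value. The sketch below uses an arbitrarily chosen date and time (an assumption made purely for illustration) so that the output is the same each time.

```python
import datetime

# A fixed moment, chosen arbitrarily, so the output never changes.
moment = datetime.datetime(2012, 2, 14, 9, 5, 30)

stamp = moment.strftime("%Y-%m-%d %H:%M")   # year-month-day hour:minute
us_date = moment.strftime("%m/%d/%Y")       # month/day/year
clock = moment.strftime("%H:%M:%S")         # hour:minute:second

print(stamp)    # 2012-02-14 09:05
print(us_date)  # 02/14/2012
print(clock)    # 09:05:30
```

Each % code is replaced by one zero-padded field of the datetime object; everything else in the format string is copied as-is.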

String formatting
How do we output a dollar amount?
def main():
    amount = eval(input("Enter the amount: "))

    print("$"+str(amount))

main()

Example execution:
Enter the amount: 1.50
$1.5

But what if we want two decimal places?
def main():
    amount = eval(input("Enter the amount: "))

    print("${0:0.2f}".format(amount))

main()

Example execution:
Enter the amount: 1.50
$1.50

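The general form is <template-string>.format(<values>), where <template-string> can contain slots of the general form {<index>:<format-specifier>}. The index says which value to insert, and the format specifier controls how that value is displayed.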

Examples:
>>> "Hello {0} {1}, you may have won ${2}".format("Mr.","Smith",10000)
'Hello Mr. Smith, you may have won $10000'
>>> "This int, {0:5}, was placed in a field of width 5".format(7)
'This int,     7, was placed in a field of width 5'
>>> "This int, {0:10}, was placed in a field of width 10".format(7)
'This int,          7, was placed in a field of width 10'
>>> "This float, {0:10.5}, has width 10 and precision 5".format(3.1415926)
'This float,     3.1416, has width 10 and precision 5'
>>> "This float, {0:10.5f}, is fixed at 5 decimal places".format(3.1415926)
'This float,    3.14159, is fixed at 5 decimal places'
>>> "This float, {0:0.5}, has width 0 and precision 5".format(3.1415926)
'This float, 3.1416, has width 0 and precision 5'

Justification examples:
>>> "left justification: {0:<5}".format("Hi!")
'left justification: Hi!  '
>>> "right justification: {0:>5}".format("Hi!")
'right justification:   Hi!'
>>> "centered: {0:^5}".format("Hi!")
'centered:  Hi! '
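
Width, justification, and precision are often combined to line up columns of output. The following is a minimal sketch under assumed data: the item names and prices are made up for illustration.

```python
items = [("Coffee", 1.5), ("Bagel", 2.25), ("Juice", 3.0)]

lines = []
for name, price in items:
    # {0:<10} left-justifies the name in a field of width 10;
    # {1:>6.2f} right-justifies the price in width 6 with 2 decimal places.
    lines.append("{0:<10}${1:>6.2f}".format(name, price))

for line in lines:
    print(line)
```

Because every name occupies exactly 10 columns and every price exactly 6, the dollar signs and decimal points line up regardless of the values.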

Exercise: Better Change Counter
Enter the number of quarters, dimes, nickels, and pennies. Then output the total value of your change.

format function and updated change calculator as exercise

= File Processing =
One critical feature of an application is the ability to store and retrieve information from files on the disk.

Conceptually, a file is a sequence of data stored on secondary memory (usually a disk). Files can contain any data type, but the easiest files to work with contain text.

Files of text can be read and understood by humans. You can think of a text file as a long string that happens to be stored on disk. A special character that we've seen before in class is used to mark the end of lines: \n.

For example, consider a file that contains the following:
Hello
World

Goodbye 32

When stored to a file, you get this sequence of characters:

Hello\nWorld\n\nGoodbye 32

This is no different than how we've used newlines before in this class.

The exact details of file-processing differ substantially among programming languages. In fact, Python itself has multiple ways to accomplish the same file I/O. But there is a common pattern:
 * 1) Open the file by associating it with an object in our program
 * 2) Manipulate the file object (e.g., read, write, seek)
 * 3) Close the file, which is necessary to maintain correspondence between the file on disk and the file object. For example, changes you make to the file object might not show up on the disk until you close the file.

As you "edit the file," you are really making changes to data in memory, not the file itself. But Python will take care of modifying the file on the disk.

Working with text files in Python is easy. The first step is to create a file object corresponding to a file on disk:
<filevar> = open(<name>, <mode>)

The mode is a string parameter that is either "r" or "w" depending on whether we intend to read from the file or write to the file.

An example:
infile = open("numbers.dat","r")

Reading from a file
Python provides three related operations for reading information from a file:
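 * <filevar>.read() returns the entire remaining contents of the file as a single string
 * <filevar>.readline() returns the next line of the file, including its newline character
 * <filevar>.readlines() returns a list of the remaining lines in the file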

Here is an example program that reads the entire contents of a file:
# printfile.py
#    Prints a file to the screen.

def main():
    fname = input("Enter filename: ")
    infile = open(fname,"r")
    data = infile.read()
    infile.close()
    print(data)

main()

The readline operation can be used to read the next line from a file. Successive calls to readline get successive lines from the file. These lines will include the newline character. Example:

infile = open("Some_file.txt","r")
for i in range(5):
    line = infile.readline()
    print(line)

We can remove the newline by slicing it off: line[:-1]
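
One caveat with slicing is that line[:-1] always drops the last character, whether or not it is a newline. The sketch below compares it with the string method rstrip, which is offered here as an alternative rather than as the approach taken in these notes; the sample strings are made up.

```python
with_newline = "Goodbye 32\n"
without_newline = "Goodbye 32"   # e.g. a last line with no trailing newline

# Slicing removes the final character unconditionally.
print(with_newline[:-1])       # Goodbye 32
print(without_newline[:-1])    # Goodbye 3

# rstrip("\n") removes trailing newlines only when they are present.
print(with_newline.rstrip("\n"))     # Goodbye 32
print(without_newline.rstrip("\n"))  # Goodbye 32
```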

We can also loop through all of the lines in a file using:
infile = open("Some_file.txt","r")
for line in infile.readlines():
    pass  # process the line here
infile.close()

A potential drawback of this approach is that all of the lines of the file are read at once. This can be a problem for very large files, which take up too much RAM. There is a simple alternative:

infile = open("Some_file.txt","r")
for line in infile:
    pass  # process the line here
infile.close()
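
Here is a self-contained sketch of the line-at-a-time loop doing some actual processing. So that it can run anywhere, it first creates a small sample file in a temporary directory; the file name, its contents, and the use of the write method to create it are assumptions made for this illustration.

```python
import os
import tempfile

# Create a small sample file so the example does not depend on existing files.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
outfile = open(path, "w")
outfile.write("Hello\nWorld\n\nGoodbye 32\n")
outfile.close()

# Read one line at a time; only the current line is held in memory.
infile = open(path, "r")
count = 0
longest = ""
for line in infile:
    line = line.rstrip("\n")        # drop the trailing newline
    count = count + 1
    if len(line) > len(longest):
        longest = line
infile.close()

print(count, "lines; longest:", longest)
```

The loop counts every line, including the empty one, and remembers the longest line seen so far.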

Writing to a file
Opening a file for writing prepares that file to receive data. If no file with the name exists, it will be created. If a file with the name does exist, then Python will delete it and create a new, empty file.

outfile = open("mydata.out","w")

The easiest way to write information into a text file is to use the already familiar print function with its file keyword argument: print(..., file=<outputFile>)
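
The sketch below puts the open, write, and close steps together and then reads the file back to check the result. It writes to a temporary directory so that no existing file is overwritten; that location and the data being written are assumptions made for this illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "mydata.out")

# Open for writing, print a few lines into the file, then close it.
outfile = open(path, "w")
for n in range(1, 4):
    print(n, n * n, file=outfile)   # same formatting as printing to the screen
outfile.close()

# Read the file back to confirm what was stored on disk.
infile = open(path, "r")
contents = infile.read()
infile.close()
print(contents)
```

Each print call adds a newline at the end, exactly as it would on the screen, so the file ends up with one line per call.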