
    weidagang2046's Column

    When things are investigated, knowledge is extended

    String Manipulation

    Programmers need to know how to manipulate strings for a variety of purposes, regardless of the programming language they are working in. This article will explain the various methods used to manipulate strings in Python.

    Introduction

    String manipulation is very useful and very widely used in every language. Often, programmers are required to break down strings and examine them closely. For example, in my articles on IRC (http://www.devshed.com/c/a/Python/Python-and-IRC/ and http://www.devshed.com/c/a/Python/Basic-IRC-Tasks/), I used the split method to break down commands to make working with them easier. In my article on sockets (http://www.devshed.com/c/a/Python/Sockets-in-Python/), I used regular expressions to look through a website and extract a currency exchange rate.

    This article will take a look at the various methods of manipulating strings, covering things from basic methods to regular expressions in Python. String manipulation is a skill that every Python programmer should be familiar with.

    String Methods

    The most basic way to manipulate strings is through the methods that are built into them. We can perform a limited number of tasks on strings through these methods. Open up the Python interactive interpreter. Let's create a string and play around with it a bit.

    >>> test = 'This is just a simple string.'

    Let's take a quick detour and use the len function. It can be used to find the length of a string. I'm not sure why it's a function rather than a method, but that's a separate issue:

    >>> len ( test )
    29

    All right, now let's get back to those methods I was talking about. Let's take our string and replace a word using the replace method:

    >>> test = test.replace ( 'simple', 'short' )
    >>> test
    'This is just a short string.'

    Now let's count the number of times a given word, or, in this case, character, appears in a string:

    >>> test.count ( 'r' )
    2

    We can find characters or words, too:

    >>> test.find ( 'r' )
    18
    >>> test [ 18 ]
    'r'
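One detail worth knowing: find returns -1 when the substring is not there, and it accepts an optional starting position. Here is a minimal sketch that uses both facts to collect every occurrence. The find_all helper is a hypothetical name made up for this example, not something built into Python.

```python
test = 'This is just a short string.'

# find returns the index of the first occurrence, or -1 if there is none.
first_r = test.find('r')
missing = test.find('z')


def find_all(text, piece):
    """Return the index of every non-overlapping occurrence of piece in text."""
    if not piece:
        # An empty piece would never advance the search, so bail out early.
        return []
    positions = []
    start = text.find(piece)
    while start != -1:
        positions.append(start)
        # The second argument tells find where to resume looking.
        start = text.find(piece, start + len(piece))
    return positions


print(first_r)
print(missing)
print(find_all(test, 'r'))
```

Checking for -1 before using the result as an index matters, because test[-1] is a valid index and would silently return the last character.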

    Splitting a string is something I find myself doing often. The split method is used for this:

    >>> test.split()
    ['This', 'is', 'just', 'a', 'short', 'string.']

    We can choose the point that we split it at:

    >>> test.split ( 'a' )
    ['This is just ', ' short string.']

    Rejoining our split string can be done using the join method:

    >>> ' some '.join ( test.split ( 'a' ) )
    'This is just  some  short string.'
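Since I mentioned breaking down commands with split, here is a rough sketch of that idea. The line below is a made-up IRC-style message used only for illustration, and the variable names are my own assumptions.

```python
# A made-up IRC-style line, used only for illustration.
line = 'PRIVMSG #python :hello there, world'

# The second argument to split limits the number of splits, so the
# message text after the second space stays in one piece.
command, target, message = line.split(' ', 2)
message = message.lstrip(':')

# join is the inverse operation: it glues a list back together.
rebuilt = ' '.join([command, target, ':' + message])

print(command)
print(target)
print(message)
print(rebuilt == line)
```

Limiting the number of splits is what keeps the spaces inside the message from being broken apart.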

    We can play around with the case of letters in our string, too. Let's make it all upper case:

    >>> test.upper()
    'THIS IS JUST A SHORT STRING.'

    Now let's make it lowercase:

    >>> test.lower()
    'this is just a short string.'

    Let's capitalize only the first letter of the lowercase string:

    >>> test.lower().capitalize()
    'This is just a short string.'

    We can also use the title method. This capitalizes the first letter in each word:

    >>> test.title()
    'This Is Just A Short String.'

    Trading case is possible:

    >>> test.swapcase()
    'tHIS IS JUST A SHORT STRING.'
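A common use for the case methods is comparing strings without caring about capitalization. This is only a small sketch; same_word is a hypothetical helper name invented for the example.

```python
def same_word(first, second):
    """Return True when the two strings differ only by letter case."""
    return first.lower() == second.lower()


print(same_word('Python', 'PYTHON'))
print(same_word('Python', 'Pythons'))

# capitalize touches only the first letter; title touches every word.
heading = 'string manipulation in python'
print(heading.capitalize())
print(heading.title())
```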

    We can run a number of tests on strings using a few methods. Let's check to see whether a given string is all upper case:

    >>> 'UPPER'.isupper()
    True
    >>> 'UpPEr'.isupper()
    False

    Likewise, we can check to see whether a string contains only lower case characters:

    >>> 'lower'.islower()
    True
    >>> 'Lower'.islower()
    False

    Checking whether a string looks like a title is simple, too:

    >>> 'This Is A Title'.istitle()
    True
    >>> 'This is A title'.istitle()
    False

    We can check whether a string is alphanumeric:

    >>> 'aa44'.isalnum()
    True
    >>> 'a$44'.isalnum()
    False

    It is also possible to check whether a string contains only letters:

    >>> 'letters'.isalpha()
    True
    >>> 'letters4'.isalpha()
    False

    Here's how you check whether a string contains only numbers:

    >>> '306090'.isdigit()
    True
    >>> '30-60-90 Triangle'.isdigit()
    False

    We can also check whether a string only contains spaces:

    >>> '   '.isspace()
    True
    >>> ''.isspace()
    False
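These tests combine nicely when you need to sort out what kind of input you were given. The sketch below is one possible arrangement; describe and its labels are hypothetical names chosen for this example. Note that the empty string fails every one of these tests, as the isspace example above already hints.

```python
def describe(text):
    """Return a short label for text based on the string test methods."""
    if text.isspace():
        return 'blank'
    if text.isdigit():
        return 'number'
    if text.isalpha():
        return 'word'
    if text.isalnum():
        return 'mixed'
    return 'other'


for sample in ['306090', 'letters', 'aa44', '   ', 'a$44', '']:
    print(repr(sample) + ' -> ' + describe(sample))
```

The order of the checks matters: isalnum is also true for pure digits and pure letters, so it has to come after the narrower tests.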

    Speaking of spaces, we can add spaces on either side of a string. Let's add spaces to the right of a string:

    >>> 'A string.'.ljust ( 15 )
    'A string.      '

    To add spaces to the left of a string, the rjust method is used:

    >>> 'A string.'.rjust ( 15 )
    '      A string.'

    The center method is used to center a string in spaces:

    >>> 'A string.'.center ( 15 )
    '   A string.   '

    We can strip spaces on either side of a string:

    >>> 'String.'.rjust ( 15 ).strip()
    'String.'
    >>> 'String.'.ljust ( 15 ).rstrip()
    'String.'
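Padding is mostly useful for lining up columns of text. Here is a minimal sketch with made-up data; format_row and the column widths are assumptions picked for the example.

```python
# Made-up data, used only for illustration.
rows = [('apples', '3'), ('kiwis', '12'), ('figs', '150')]


def format_row(name, amount):
    """Pad the name on the right and the amount on the left."""
    return name.ljust(10) + amount.rjust(5)


for name, amount in rows:
    print(format_row(name, amount))

# strip removes padding from both sides; lstrip and rstrip take one side each.
padded = 'figs'.center(10)
print(repr(padded))
print(repr(padded.lstrip()))
print(repr(padded.rstrip()))
print(repr(padded.strip()))
```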

    Regular expressions are a very powerful tool in any language. They allow patterns to be matched against strings. Actions such as replacement can be performed on the string if the regular expression pattern matches. Python's module for regular expressions is the re module. Open the Python interactive interpreter, and let's take a closer look at regular expressions and the re module:

    >>> import re

    Let's create a simple string we can use to play around with:

    >>> test = 'This is for testing regular expressions in Python.'

    I spoke of matching special patterns with regular expressions, but let's start with matching a simple string just to get used to regular expressions. There are two functions for matching patterns in strings in the re module: search and match. Let's take a look at search first. It works like this:

    >>> result = re.search ( 'This', test )

    We can extract the results using the group method:

    >>> result.group ( 0 )
    'This'

    You're probably wondering about the group method right now and why we pass zero to it. It's simple, and I'll explain. You see, patterns can be organized into groups, like this:

    >>> result = re.search ( '(Th)(is)', test )

    There are two groups surrounded by parentheses. We can extract them using the group method:

    >>> result.group ( 1 )
    'Th'
    >>> result.group ( 2 )
    'is'

    Passing zero to the method returns the entire match, which here spans both of the groups:

    >>> result.group ( 0 )
    'This'
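The match object has a few other ways to get at the same information. This sketch shows the ones I am sure exist in the re module: groups, group with several arguments, and span.

```python
import re

test = 'This is for testing regular expressions in Python.'
result = re.search('(Th)(is)', test)

# group(0) is the whole match; groups() returns every numbered group at once.
print(result.group(0))
print(result.groups())

# group accepts several numbers and returns the pieces as a tuple.
print(result.group(1, 2))

# span gives the start and end positions of the whole match.
print(result.span())
```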

    The benefit of groups will become more clear once we work our way into actual patterns. First, though, let's take a look at the match function. It works similarly, but there is a crucial difference:

    >>> result =  re.match ( 'This', test )
    >>> print result
    <_sre.SRE_Match object at 0x00994250>
    >>> print result.group ( 0 )
    This
    >>> result = re.match ( 'regular', test )
    >>> print result
    None

    Notice that None was returned, even though “regular” is in the string. If you haven't figured it out, the match function matches patterns only at the beginning of the string, while the search function examines the whole string. You might be wondering if it's possible, then, to make the match function match “regular,” since it's not at the beginning of the string. The answer is yes. It's possible to match it, and that brings us into patterns.
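Before moving on, the difference between the two functions can be put side by side in a short script. This is just a sketch; the list of words to look for is made up, and locate is a hypothetical helper name.

```python
import re

test = 'This is for testing regular expressions in Python.'

# match only succeeds if the pattern fits at position 0.
at_start = re.match('regular', test)
print(at_start)

# search scans forward until the pattern fits somewhere.
found = re.search('regular', test)
print(found.group(0) + ' at ' + str(found.start()))


def locate(pattern, text):
    """Return the position of pattern in text, or None if it is absent."""
    result = re.search(pattern, text)
    # Both functions return None on failure, so check before using the result.
    if result is None:
        return None
    return result.start()


for word in ['This', 'regular', 'Perl']:
    print(word + ': ' + str(locate(word, test)))
```

Calling group on a failed match raises an AttributeError, which is why the None check comes first.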

    The character “.” will match any character (except, by default, a newline). We can get the match function to match “regular” by putting a period for every character before it. Let's split this up into two groups as well. One will contain the periods, and one will contain “regular”:

    >>> result = re.match ( '(....................)(regular)', test )
    >>> result.group ( 0 )
    'This is for testing regular'
    >>> result.group ( 1 )
    'This is for testing '
    >>> result.group ( 2 )
    'regular'

    Aha! We matched it! However, it's ridiculous to have to type in all those periods. The good news is that we don't have to do that. Take a look at this and remember that there are twenty characters before “regular”:

    >>> result = re.match ( '(.{20})(regular)', test )
    >>> result.group ( 0 )
    'This is for testing regular'
    >>> result.group ( 1 )
    'This is for testing '
    >>> result.group ( 2 )
    'regular'

    That's a lot easier. Now let's look at a few more patterns. Here's how you can use braces in a more advanced way:

    >>> result = re.match ( '(.{10,20})(regular)', test )
    >>> result.group ( 0 )
    'This is for testing regular'
    >>> result = re.match ( '(.{10,20})(testing)', test )
    >>> result.group ( 0 )
    'This is for testing'

    By entering two arguments, so to speak, you can match any number of characters in a range. In this case, that range is 10-20. Sometimes, however, this can cause undesired behavior. Take a look at this string:

    >>> anotherTest = 'a cat, a dog, a goat, a person'

    Let's match a range of characters:

    >>> result = re.match ( '(.{5,20})(,)', anotherTest )
    >>> result.group ( 1 )
    'a cat, a dog, a goat'

    What if we only want “a cat” though? This can be done by appending “?” after the closing brace:

    >>> result = re.match ( '(.{5,20}?)(,)', anotherTest )
    >>> result.group ( 1 )
    'a cat'

    Appending a question mark to something makes it match as few characters as possible. A question mark that does that, though, is not to be confused with this pattern:

    >>> anotherTest = '012345'
    >>> result = re.match ( '01?', anotherTest )
    >>> result.group ( 0 )
    '01'
    >>> result = re.match ( '0123456?', anotherTest )
    >>> result.group ( 0 )
    '012345'

    As you can see with the example, the character before a question mark is optional. Next is the “*” pattern. It matches zero or more of the character it follows, like this:

    >>> anotherTest = 'Just a silly string.'
    >>> result = re.match ( '(.*)(a)(.*)(string)', anotherTest )
    >>> result.group ( 0 )
    'Just a silly string'

    However, take a look at this:

    >>> anotherTest = 'Just a silly string. A very silly string.'
    >>> result = re.match ( '(.*)(a)(.*)(string)', anotherTest )
    >>> result.group ( 0 )
    'Just a silly string. A very silly string'

    What if, however, we want to only match the first sentence? If you've been following along closely, you'll know that “?” will, again, do the trick:

    >>> result = re.match ( '(.*?)(a)(.*?)(string)', anotherTest )
    >>> result.group ( 0 )
    'Just a silly string'
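The same greedy-versus-lazy behavior shows up whenever a pattern has a clear closing character. Here is a small sketch on a made-up piece of markup, chosen only for illustration.

```python
import re

# A made-up snippet of markup, used only for illustration.
markup = '<b>bold</b> and <i>italic</i>'

# Greedy: ".*" runs to the end of the string, then backs up to the last ">".
greedy = re.match('<.*>', markup).group(0)
print(greedy)

# Lazy: ".*?" grows one character at a time and stops at the first ">".
lazy = re.match('<.*?>', markup).group(0)
print(lazy)
```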

    As I mentioned earlier, though, “*” doesn't have to match anything:

    >>> anotherTest = '0101'
    >>> result = re.match ( '(.*?)(01)', anotherTest )
    >>> result.group ( 0 )
    '01'

    What if we want to skip past the first two characters? This is possible by using “+”, which is similar to “*”, except that it matches at least one character:

    >>> result = re.match ( '(.+?)(01)', anotherTest )
    >>> result.group ( 0 )
    '0101'
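To keep “?”, “*” and “+” straight, it can help to run all three against the same inputs. The sample strings and the matching helper below are assumptions made up for this sketch.

```python
import re

samples = ['ac', 'abc', 'abbbc']

# Each pattern applies a different quantifier to the letter "b".
patterns = [
    ('ab?c', 'zero or one b'),
    ('ab*c', 'zero or more b'),
    ('ab+c', 'one or more b'),
]


def matching(pattern):
    """Return the samples that the pattern matches from the start."""
    return [s for s in samples if re.match(pattern, s)]


for pattern, meaning in patterns:
    print(pattern + ' (' + meaning + '): ' + str(matching(pattern)))
```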

    We can also match a range of characters. For example, we can match only the first four letters of the alphabet:

    >>> anotherTest = 'a101'
    >>> result = re.match ( '[a-d]', anotherTest )
    >>> print result
    <_sre.SRE_Match object at 0x00B47B10>
    >>> anotherTest = 'q101'
    >>> result = re.match ( '[a-d]', anotherTest )
    >>> print result
    None

    We can also match one of a few patterns using “|”:

    >>> testA = 'a'
    >>> testB = 'b'
    >>> result = re.match ( '(a|b)', testA )
    >>> print result
    <_sre.SRE_Match object at 0x00B46D60>
    >>> result = re.match ( '(a|b)', testB )
    >>> print result
    <_sre.SRE_Match object at 0x00B46E60>

    Finally, there are a number of special sequences. “\A” matches at the start of a string. “\Z” matches at the end of a string. “\d” matches a digit. “\D” matches anything but a digit. “\s” matches whitespace. “\S” matches anything but whitespace.
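Here is a quick sketch of a few of these sequences in action. The sample string is made up for illustration. The patterns are written as raw strings (the r prefix) so that Python passes the backslashes through to the re module untouched.

```python
import re

# A made-up line of text, used only for illustration.
entry = 'port 8080   open'

# \d+ grabs a run of digits, \s+ skips any amount of whitespace,
# and \S+ grabs the run of non-whitespace characters that follows.
result = re.search(r'(\d+)\s+(\S+)', entry)
print(result.group(1))
print(result.group(2))

# \A and \Z pin the pattern to both ends, so the whole string must fit.
all_digits = re.search(r'\A\d+\Z', '306090') is not None
not_all_digits = re.search(r'\A\d+\Z', '30-60-90') is not None
print(all_digits)
print(not_all_digits)
```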

    We can name our groups:

    >>> nameTest = 'hot sauce'
    >>> result = re.match ( '(?P<one>hot)', nameTest )
    >>> result.group ( 'one' )
    'hot'
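Named groups pay off once a pattern has more than one piece you care about. Below is a minimal sketch; the setting string and the group names key and value are assumptions invented for the example.

```python
import re

# A made-up "key=value" setting, used only for illustration.
setting = 'timeout=30'

result = re.match(r'(?P<key>[a-z]+)=(?P<value>\d+)', setting)

# Named groups can be read one at a time...
print(result.group('key'))
print(result.group('value'))

# ...or all at once as a dictionary.
print(result.groupdict())
```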

    We can compile patterns to use them multiple times with the re module, too:

    >>> ourPattern = re.compile ( '(.*?)(the)' )
    >>> testString = 'This is the dog and the cat.'
    >>> result = ourPattern.match ( testString )
    >>> result.group ( 0 )
    'This is the'
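The point of compiling is reuse, so here is a sketch that runs the same compiled pattern over several strings. The extra sentences are made up for illustration.

```python
import re

# Compile once, then reuse the pattern object on several strings.
ourPattern = re.compile('(.*?)(the)')

sentences = [
    'This is the dog and the cat.',
    'Where is the bird?',
    'No article here.',
]

prefixes = []
for sentence in sentences:
    result = ourPattern.match(sentence)
    # match returns None when the pattern does not fit, so test it first.
    prefixes.append(result.group(1) if result else None)

print(prefixes)
```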

    Of course, you can do more than match and extract substrings. You can replace things, too:

    >>> someString = 'I have a dream.'
    >>> re.sub ( 'dream', 'dog', someString )
    'I have a dog.'

    On a final note, you should not use regular expressions to match or replace simple strings. String methods such as find and replace handle fixed text more simply.
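To make that last point concrete, here is a rough comparison. The strings are made up for illustration. The final line uses \1 and \2 in the replacement to refer back to the groups, a feature of re.sub that was not covered above.

```python
import re

someString = 'I have a dream. You have a dream too.'

# A fixed word needs no pattern, so the string method is enough.
plain = someString.replace('dream', 'dog')
print(plain)

# re.sub earns its keep when the target varies, such as any run of digits.
prices = 'apples 3, kiwis 12, figs 150'
masked = re.sub(r'\d+', '#', prices)
print(masked)

# Groups from the pattern can be reused in the replacement text.
swapped = re.sub(r'([a-z]+) (\d+)', r'\2 \1', prices)
print(swapped)
```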

    Conclusion

    Now you have a basic knowledge of string manipulation in Python behind you. As I explained at the very beginning of the article, string manipulation is necessary to many applications, both large and small. It is used frequently, and a basic knowledge of it is critical.

    from:http://www.devshed.com/c/a/Python/String-Manipulation/

    posted on 2005-11-19 00:31 by weidagang2046, category: Python

