Ubuntu Forums ubuntu.com - launchpad.net - ubuntu help  


Programming Talk
This forum is for all programming questions.
The questions do not have to be directly related to Ubuntu and any programming language is allowed.

 
Old January 4th, 2007   #1
pedrotuga
Way Too Much Ubuntu
 
 
Join Date: Dec 2005
Beans: 275
Ubuntu 9.04 Jaunty Jackalope
python - checking if any of the values in the list in a string

I have a list of stopwords.

Is there an immediate way to check if a string contains any of those, or do I have to loop over it and test one by one?

Also, how do I lowercase a string?

Last edited by pedrotuga; January 4th, 2007 at 09:04 PM..
Old January 4th, 2007   #2
Steveire
A Carafe of Ubuntu
 
Join Date: Apr 2006
My beans are hidden!
Re: python - checking if any of the values in the list in a string

Code:
stopwords = ["stop", "halt", "freeze"]

def contains_stopword(string):
    # substring test against each stopword in turn
    for word in stopwords:
        if word in string:
            return True
    return False
Code:
string = "erDRTGrfg"
string.lower()  # returns "erdrtgrfg"; the original string is unchanged
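The loop and the lowercasing can also be combined into a single expression with the built-in any(). This is only a sketch, written for Python 3; contains_any is a hypothetical helper name that does not appear in the thread. Note that "in" on a string is a substring test, so it also matches inside longer words.

```python
# Sketch for Python 3; contains_any is a hypothetical helper name.
stopwords = ["stop", "halt", "freeze"]

def contains_any(text, words):
    """Return True if any of the words occurs as a substring of text."""
    lowered = text.lower()  # lower() returns a new string; text is unchanged
    return any(word in lowered for word in words)

print(contains_any("Please STOP the car", stopwords))  # True
print(contains_any("keep going", stopwords))           # False
print(contains_any("Unstoppable", stopwords))          # True: substring match
```

any() stops at the first match, just like the explicit loop with an early return.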
__________________
KDE/Akonadi developer

Last edited by Steveire; January 4th, 2007 at 09:27 PM..
Old January 4th, 2007   #3
duff
Ubuntu House Blend
 
 
Join Date: Nov 2004
Location: Clemson, SC
Beans: 271
Re: python - checking if any of the values in the list in a string

or
Code:
stopwords = set(['stop', 'halt', 'freeze'])
stopwords.intersection(set(word.split()))
The resulting set will be empty if word (the string being checked) doesn't contain any of the stopwords as whole words.
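A slightly fuller sketch of the set approach, for Python 3. The helper name shares_word is hypothetical, and splitting on whitespace is an assumption about what counts as a word: punctuation stays attached to the tokens.

```python
# Sketch for Python 3; shares_word is a hypothetical helper name.
stopwords = {"stop", "halt", "freeze"}

def shares_word(text, words):
    """Return True if text contains any of the words as a whole word."""
    tokens = set(text.lower().split())  # split on whitespace only
    return not tokens.isdisjoint(words)

print(shares_word("please stop now", stopwords))    # True: whole word present
print(shares_word("unstoppable force", stopwords))  # False: only a substring
print(shares_word("please stop, now", stopwords))   # False: token is "stop,"
```

Unlike the substring test, this only matches whole words, which may or may not be what you want.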
Old January 4th, 2007   #4
ghostdog74
Iced Blended Vanilla Crème Ubuntu
 
Join Date: Sep 2006
Beans: 2,719
Re: python - checking if any of the values in the list in a string

Quote:
Originally Posted by pedrotuga View Post
I have a list of stopwords.

Is there an immediate way to check if a string contains any of those, or do I have to loop over it and test one by one?

Also, how do I lowercase a string?
You can just use the "in" keyword to test a single word against the list:
Code:
stopwords = ["stop", "halt", "freeze"]
word = "go"  # example word to test
if word not in stopwords:
    print "Word is not in stopwords"
To change a string to lowercase:
Code:
astring = "THIS STRING IS UPPER"  # note: don't use "string" as a variable name.
print astring.lower()
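Keep in mind that "in" means different things for lists and strings, which matters for the original question. A short sketch for Python 3 (the variable names are only illustrative):

```python
stopwords = ["stop", "halt", "freeze"]

in_list = "stop" in stopwords             # exact match against list elements
partial_in_list = "stopp" in stopwords    # a partial word is not an element
in_string = "stop" in "don't stop now"    # substring match inside a string
in_longer_word = "stop" in "unstoppable"  # substrings match inside words too

print(in_list, partial_in_list, in_string, in_longer_word)
# True False True True
```

So testing a word against the list checks one word at a time, while testing a stopword against the whole string still needs a loop over the stopwords.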
Old January 5th, 2007   #5
pedrotuga
Way Too Much Ubuntu
 
 
Join Date: Dec 2005
Beans: 275
Ubuntu 9.04 Jaunty Jackalope
Re: python - checking if any of the values in the list in a string

thanks everybody.

Python's loop syntax is kind of elegant, so the solution I mentioned at the beginning is actually very simple, as Steve showed. Still, I think the intersection might be faster.
Old January 5th, 2007   #6
pmasiar
Day Old Decaf
 
Join Date: Jun 2006
Location: CT, USA
Beans: 5,268
Ubuntu 6.10 Edgy
Re: python - checking if any of the values in the list in a string

Why guess, it's not hard to time it yourself:

Program to time:
Code:
import time

loopTimes = 10000000
stoplist = ["stop", "halt", "freeze"]
stopSet = set(["stop", "halt", "freeze"])
string = 'many differnet words which may containn stop or may not'
wordsSet = set(string.split())

def using_in(text, stopwords):
    "return 1 (true) if any of the stopwords are in text -- using in"
    for word in stopwords:
        if word in text:
            return 1
    return 0

def using_set(wordsSet, stopSet):
    "return 1 (true) if any of the stopwords are in text -- using sets"
    return len( wordsSet & stopSet)

tStart = time.time()
for ii in range(loopTimes):
    pass
t_empty = time.time() - tStart # empty loop

tStart = time.time()
for ii in range(loopTimes):
    res = using_in(string, stoplist)
t_in = time.time() - tStart # using in

tStart = time.time()
for ii in range(loopTimes):
    res = using_set(wordsSet, stopSet)
t_set = time.time() - tStart # using set
    
print loopTimes , 'using in:', t_in - t_empty
print loopTimes , 'using set:', t_set - t_empty
Results of 4 runs (increasing loop times):

Code:
>>> ================================ RESTART 
100 using in: 0.0
100 using set: 0.0
>>> ================================ RESTART 
10000 using in: 0.0160000324249
10000 using set: 0.0149998664856
>>> ================================ RESTART 
1000000 using in: 1.78200006485
1000000 using set: 0.983999967575
>>> ================================ RESTART 
10000000 using in: 17.8279998302
10000000 using set: 9.82899999619
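Result: the set version took a little over half the time here, saving roughly 0.8 seconds per million calls. The return on invested time (learning sets) will pay off only if you run the code more than about 1 billion times. Less than that, and the plain loop is **faster** once you count your own time.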
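The same comparison can also be run with the standard timeit module, which handles the loop and the timer for you. This is a sketch for Python 3, not the program that produced the numbers above; the call count and the repeat count are arbitrary choices. The 0.0 readings in the shorter runs above suggest the clock was too coarse for such brief intervals, which timeit is designed to avoid.

```python
import timeit

stoplist = ["stop", "halt", "freeze"]
stop_set = set(stoplist)
text = "many different words which may contain stop or may not"
words_set = set(text.split())

def using_in(text, stopwords):
    """Substring test against each stopword in turn."""
    for word in stopwords:
        if word in text:
            return True
    return False

def using_set(words_set, stop_set):
    """Whole-word test via set intersection (text is pre-split)."""
    return bool(words_set & stop_set)

number = 100000  # calls per measurement; an arbitrary choice
# Take the best of 3 repeats to reduce noise from other processes.
t_in = min(timeit.repeat(lambda: using_in(text, stoplist),
                         number=number, repeat=3))
t_set = min(timeit.repeat(lambda: using_set(words_set, stop_set),
                          number=number, repeat=3))

print(number, "using in:", t_in)
print(number, "using set:", t_set)
```

Both measurements include the small overhead of the lambda call, so compare them with each other rather than reading them as absolute costs.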
Old January 5th, 2007   #7
pmasiar
Day Old Decaf
 
Join Date: Jun 2006
Location: CT, USA
Beans: 5,268
Ubuntu 6.10 Edgy
Re: python - checking if any of the values in the list in a string

I was not happy with my previous solution: the Python mantra is that the most obvious solution is the best, so I looked deeper. The sets approach gets a little help: the target string is pre-parsed into words. What if I pre-parse it for the "in" approach too?

I added these snippets in the obvious places:

Code:
wordlist = string.split()

def using_inlist(wordlist, stopwords):
    "using in from list"
    for word in stopwords:
        if word in wordlist:
            return 1
    return 0

tStart = time.time()
for ii in range(loopTimes):
    res = using_inlist(wordlist, stoplist)
t_inlist = time.time() - tStart # using in list

print loopTimes , 'using in list:', t_inlist - t_empty
Result: the obvious approach IS the fastest!
Code:
1000000 using in: 1.78099989891
1000000 using in list: 0.922000169754
1000000 using set: 0.952999830246
Old January 5th, 2007   #8
slavik
Ubuntu Master Roaster
 
 
Join Date: Jan 2006
My beans are hidden!
Ubuntu Jaunty Jackalope (testing)
Re: python - checking if any of the values in the list in a string

This problem has O(nm) complexity: you are matching all n elements in one array against all m elements in another array.
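For whole-word matching, the nested comparison can be avoided by putting the stopwords in a set, since set lookups take constant time on average. A sketch for Python 3, with any_in_common as a hypothetical helper name; it does not apply to substring matching, where each stopword still has to be searched for in the text.

```python
def any_in_common(words, stopwords):
    """Return True if any element of words is also in stopwords."""
    stop_set = set(stopwords)  # built once: about m steps
    for word in words:         # at most n iterations
        if word in stop_set:   # average-case constant-time lookup
            return True
    return False

print(any_in_common("please stop now".split(), ["stop", "halt", "freeze"]))
print(any_in_common("keep going".split(), ["stop", "halt", "freeze"]))
```

That brings the average cost down to roughly O(n + m), at the price of building the set first.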
Old April 10th, 2007   #9
RSL
5 Cups of Ubuntu
 
Join Date: Aug 2006
Location: Atlanta, GA
Beans: 34
Ubuntu 7.04 Feisty Fawn
Re: python - checking if any of the values in the list in a string

Aw... Not to be a jerk but this is one of those moments where Ruby's Enumerable#include? would be sweet, right?

Code:
%w{list of stop words}.include?(word)
Oh, crap. I just became _that_ guy.
Old April 10th, 2007   #10
pmasiar
Day Old Decaf
 
Join Date: Jun 2006
Location: CT, USA
Beans: 5,268
Ubuntu 6.10 Edgy
Re: python - checking if any of the values in the list in a string

Code:
>>> stop_words = ['a', 'b', 'c', 'd']
>>> word = 'd'
>>> word in stop_words
True
It is python, after all. Simple things are obvious, hard are possible.
