Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

0 votes
217 views
in Technique by (71.8m points)

Creating a new field based on text matching and conditions from multiple fields. Python

I have a data frame like this, where I want to assign a new category based on matching certain words in the "Review" field and certain "Product" types. I created two lists of n-grams, one for each category, and I need "My Category" to be set when words from a list appear in "Review" and the product is any of the selected product types. The code needs to assign multiple categories when more than one matches.

Record ID | Product | Review                                          | My Category
123       | Tablet  | Battery life sucks. Don't buy.                  | Category 1
456       | Laptop  | Love the sleek design, but battery life is bad. | Category 2
789       | Tablet  | I love it, even though it sucks sometimes.      | Category 1, Category 2


1 Answer

0 votes
by (71.8m points)

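You should use DataFrame.apply with axis=1 for this task, so that a function is called once per row and can look at several columns at the same time.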

from io import StringIO
import pandas as pd

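data = StringIO("""
Record ID  Product  Review
123  Tablet  Battery life sucks.
456  Laptop  Love the sleek design, but battery life is bad.
789  Tablet  I love it, even though it sucks sometimes.
""")

# columns are separated by two or more spaces; a regex separator needs the python engine
df = pd.read_csv(data, sep=r'\s{2,}', engine='python')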


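def categorize(row):
    """Return the category label(s) for one row.

    Columns can be accessed with dot notation, e.g., row.Product,
    or with brackets when the name has a space, e.g., row['Record ID'].
    """
    categories = []
    # determine categories: append a label for every rule this row satisfies
    return ', '.join(categories)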


df['categories'] = df.apply(categorize, axis=1)
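
To make the idea concrete, here is one way the body of categorize could be filled in. This is only a sketch: the question does not show the actual n-gram lists or product types, so the RULES dictionary, its contents, and the has_ngram helper are all made up for illustration and should be replaced with your own lists. It matches n-grams as whole words, ignoring case, and joins every matching category with a comma.

```python
import re

import pandas as pd

# Hypothetical rules: these n-grams and product sets are invented for the
# example; substitute the real lists from your project.
RULES = {
    "Category 1": {"ngrams": ["sucks", "don't buy"], "products": {"Tablet"}},
    "Category 2": {"ngrams": ["love", "sleek design"], "products": {"Tablet", "Laptop"}},
}


def has_ngram(text, ngram):
    """Return True if ngram occurs in text as whole words, ignoring case."""
    pattern = r"\b" + re.escape(ngram) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def categorize(row):
    """Join every category whose product and n-gram conditions both hold."""
    matched = [
        name
        for name, rule in RULES.items()
        if row["Product"] in rule["products"]
        and any(has_ngram(row["Review"], ng) for ng in rule["ngrams"])
    ]
    return ", ".join(matched)


# Sample rows copied from the question.
df = pd.DataFrame(
    {
        "Record ID": [123, 456, 789],
        "Product": ["Tablet", "Laptop", "Tablet"],
        "Review": [
            "Battery life sucks. Don't buy.",
            "Love the sleek design, but battery life is bad.",
            "I love it, even though it sucks sometimes.",
        ],
    }
)
df["My Category"] = df.apply(categorize, axis=1)
```

With these made-up rules the three sample rows get the categories shown in the question. The whole-word match means "love" does not fire on "lovely"; if you want plain substring matching instead, a simple `ngram.lower() in text.lower()` check would do.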
