Here we filter data along the row (vertical) direction by exact string match.

Source data: data1

In [8]: data1.head()
Out[8]:
   age  workclass  fnlwgt     education  educational-num      marital-status  \
0   25    Private  226802          11th                7       Never-married
1   38    Private   89814       HS-grad                9  Married-civ-spouse
2   28  Local-gov  336951    Assoc-acdm               12  Married-civ-spouse
3   44    Private  160323  Some-college               10  Married-civ-spouse
4   18          ?  103497  Some-college               10       Never-married

          occupation relationship   race  gender  capital-gain  capital-loss  \
0  Machine-op-inspct    Own-child  Black    Male             0             0
1    Farming-fishing      Husband  White    Male             0             0
2    Protective-serv      Husband  White    Male             0             0
3  Machine-op-inspct      Husband  Black    Male          7688             0
4                  ?    Own-child  White  Female             0             0

   hours-per-week native-country income
0              40  United-States  <=50K
1              50  United-States  <=50K
2              40  United-States   >50K
3              40  United-States   >50K
4              30  United-States  <=50K

Basic usage

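This works almost the same way as specifying a numeric condition, but it is important to enclose the search string in quotation marks (for example, '<=50K').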
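The quoting rule can be tried on a small table first. The following is a minimal sketch, assuming a hypothetical miniature DataFrame named `df` that stands in for data1 (the values are made up for illustration). It also shows the `@` prefix, which lets the query expression refer to a Python variable instead of an inline string literal.

```python
import pandas as pd

# Hypothetical miniature stand-in for data1 (not the real dataset).
df = pd.DataFrame({
    "age": [25, 38, 28, 44],
    "income": ["<=50K", "<=50K", ">50K", ">50K"],
})

# The outer double quotes delimit the query expression;
# the inner single quotes mark the string to match.
low = df.query("income == '<=50K'")

# The two quote styles can be swapped as long as they are nested consistently.
low_swapped = df.query('income == "<=50K"')

# A Python variable can be referenced with the @ prefix,
# which avoids nesting quotes altogether.
target = "<=50K"
low_var = df.query("income == @target")

print(low)
```

All three calls select the same rows, so which form to use is mostly a matter of readability.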

Example 1

# Keep only the rows where income is <=50K
data1.query("income == '<=50K'").head()

Result of Example 1

   age workclass  fnlwgt     education  educational-num      marital-status  \
0   25   Private  226802          11th                7       Never-married
1   38   Private   89814       HS-grad                9  Married-civ-spouse
4   18         ?  103497  Some-college               10       Never-married
5   34   Private  198693          10th                6       Never-married
6   29         ?  227026       HS-grad                9       Never-married

          occupation   relationship   race  gender  capital-gain  \
0  Machine-op-inspct      Own-child  Black    Male             0
1    Farming-fishing        Husband  White    Male             0
4                  ?      Own-child  White  Female             0
5      Other-service  Not-in-family  White    Male             0
6                  ?      Unmarried  Black    Male             0

   capital-loss  hours-per-week native-country income
0             0              40  United-States  <=50K
1             0              50  United-States  <=50K
4             0              30  United-States  <=50K
5             0              30  United-States  <=50K
6             0              40  United-States  <=50K

Example 2

# Keep only the rows where income is not <=50K
data1.query("income != '<=50K'").head()

Result of Example 2

    age         workclass  fnlwgt     education  educational-num  \
2    28         Local-gov  336951    Assoc-acdm               12
3    44           Private  160323  Some-college               10
7    63  Self-emp-not-inc  104626   Prof-school               15
10   65           Private  184454       HS-grad                9
14   48           Private  279724       HS-grad                9

        marital-status         occupation relationship   race gender  \
2   Married-civ-spouse    Protective-serv      Husband  White   Male
3   Married-civ-spouse  Machine-op-inspct      Husband  Black   Male
7   Married-civ-spouse     Prof-specialty      Husband  White   Male
10  Married-civ-spouse  Machine-op-inspct      Husband  White   Male
14  Married-civ-spouse  Machine-op-inspct      Husband  White   Male

    capital-gain  capital-loss  hours-per-week native-country income
2              0             0              40  United-States   >50K
3           7688             0              40  United-States   >50K
7           3103             0              32  United-States   >50K
10          6418             0              40  United-States   >50K
14          3103             0              48  United-States   >50K

Combining conditions

Example 3

# Keep only the rows where income is <=50K and occupation is not ?
data1.query("income == '<=50K' & occupation != '?'").head()

Result of Example 3

   age workclass  fnlwgt     education  educational-num      marital-status  \
0   25   Private  226802          11th                7       Never-married
1   38   Private   89814       HS-grad                9  Married-civ-spouse
5   34   Private  198693          10th                6       Never-married
8   24   Private  369667  Some-college               10       Never-married
9   55   Private  104996       7th-8th                4  Married-civ-spouse

          occupation   relationship   race  gender  capital-gain  \
0  Machine-op-inspct      Own-child  Black    Male             0
1    Farming-fishing        Husband  White    Male             0
5      Other-service  Not-in-family  White    Male             0
8      Other-service      Unmarried  White  Female             0
9       Craft-repair        Husband  White    Male             0

   capital-loss  hours-per-week native-country income
0             0              40  United-States  <=50K
1             0              50  United-States  <=50K
5             0              30  United-States  <=50K
8             0              40  United-States  <=50K
9             0              10  United-States  <=50K
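For comparison, the combined condition can be written in a few equivalent ways. This is a minimal sketch on a hypothetical miniature DataFrame `df` (made-up values, not data1). Inside a query expression, `&` and `and` behave the same, and the result matches ordinary boolean indexing. The last query assumes a reasonably recent pandas (1.0 or later, as far as I know), where a column name containing a hyphen, such as marital-status, can be wrapped in backticks.

```python
import pandas as pd

# Hypothetical miniature stand-in for data1 (not the real dataset).
df = pd.DataFrame({
    "occupation": ["Machine-op-inspct", "?", "Other-service", "Prof-specialty"],
    "marital-status": ["Never-married", "Never-married",
                       "Married-civ-spouse", "Married-civ-spouse"],
    "income": ["<=50K", "<=50K", "<=50K", ">50K"],
})

# Same condition as Example 3, written with & and with the keyword `and`.
with_amp = df.query("income == '<=50K' & occupation != '?'")
with_and = df.query("income == '<=50K' and occupation != '?'")

# Equivalent boolean indexing; here each comparison needs its own parentheses.
mask = (df["income"] == "<=50K") & (df["occupation"] != "?")
with_mask = df[mask]

# A column name with a hyphen is wrapped in backticks
# (assumption: pandas 1.0 or later).
never_married = df.query("income == '<=50K' and `marital-status` == 'Never-married'")

print(with_amp)
print(never_married)
```

The query form tends to stay readable as conditions accumulate, while boolean indexing makes the operator precedence explicit through the parentheses.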

Specifying multiple strings

List the strings you want to match inside a list ([]), separated by commas. A row is kept if its value exactly matches any one of the listed strings.

Example 4

# Keep only the rows where workclass exactly matches Private or Local-gov
data1.query("workclass == ['Private', 'Local-gov']").head()

Result of Example 4

   age  workclass  fnlwgt     education  educational-num      marital-status  \
0   25    Private  226802          11th                7       Never-married
1   38    Private   89814       HS-grad                9  Married-civ-spouse
2   28  Local-gov  336951    Assoc-acdm               12  Married-civ-spouse
3   44    Private  160323  Some-college               10  Married-civ-spouse
5   34    Private  198693          10th                6       Never-married

          occupation   relationship   race gender  capital-gain  capital-loss  \
0  Machine-op-inspct      Own-child  Black   Male             0             0
1    Farming-fishing        Husband  White   Male             0             0
2    Protective-serv        Husband  White   Male             0             0
3  Machine-op-inspct        Husband  Black   Male          7688             0
5      Other-service  Not-in-family  White   Male             0             0

   hours-per-week native-country income
0              40  United-States  <=50K
1              50  United-States  <=50K
2              40  United-States   >50K
3              40  United-States   >50K
5              30  United-States  <=50K
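The list comparison in Example 4 has a few equivalent spellings that may read more naturally. The sketch below uses a hypothetical miniature DataFrame `df` and a hypothetical variable `wanted` (neither comes from data1). In a query expression, `==` against a list works like `in`, and `not in` gives the complement.

```python
import pandas as pd

# Hypothetical miniature stand-in for data1 (not the real dataset).
df = pd.DataFrame({
    "workclass": ["Private", "Local-gov", "?", "Self-emp-not-inc", "Private"],
})

wanted = ["Private", "Local-gov"]

# The form used in Example 4: == against a list matches any listed value.
by_eq = df.query("workclass == ['Private', 'Local-gov']")

# The same filter with the `in` operator, inline and via a Python variable.
by_in = df.query("workclass in ['Private', 'Local-gov']")
by_var = df.query("workclass in @wanted")

# Equivalent boolean indexing with Series.isin.
by_isin = df[df["workclass"].isin(wanted)]

# The complement: rows whose workclass is none of the listed values.
excluded = df.query("workclass not in @wanted")

print(by_eq)
print(excluded)
```

Passing the list through `@wanted` is convenient when the candidate values are built elsewhere in the program.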

Of course, there are cases where you need to filter by partial match rather than exact match, but that is covered in a separate article.