If you have defined a DataFrame df that has a column a, then to access that column (a DataSeries object) you must enter df[a].
However, after using the with(df) command, you can use the column name a directly, without specifying the DataFrame to which the column belongs.
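The two ways of accessing a column can be sketched as follows. This is a minimal illustration, not the page's original example: the matrix entries and the second column name b are made up, and a fresh session is assumed so that a and b have no assigned values.

```maple
# Hypothetical data: a 2 x 2 DataFrame with columns named a and b.
df := DataFrame(Matrix([[1, 2], [3, 4]]), columns = [a, b]);

# Before binding, a column is reached through its DataFrame.
df[a];

# Bind the column names at the top level.
with(df);

# The column can now be referenced by its name alone.
a;
```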
The with command returns a list of the column names that it binds at the top level.
Sometimes a column name is identical to a top-level name that has a global meaning. Global names remain fully accessible. To access the global name a after binding a DataFrame that has a column named a, use the prefix form of the :- operator, as in :-a.
For example, the variable a may be in use.
After issuing a call to with(df), in which the name a occurs as a column label, the name a now refers to a column of data. The original variable is still available by using the syntax :-a.
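A sketch of this situation, assuming a fresh session. The value 12.3 is the one this page uses for the global variable; the DataFrame contents are hypothetical. Because a already has a value when the DataFrame is built, the column labels are written with unevaluation quotes here so that the label is the name a rather than 12.3.

```maple
# The global variable a is already in use.
a := 12.3;

# Hypothetical DataFrame; the quotes keep the labels as the names a and b.
df := DataFrame(Matrix([[1, 2], [3, 4]]), columns = ['a', 'b']);

# Bind the column names at the top level.
with(df);

# The bare name now refers to the column of data.
a;

# The prefix form still reaches the original global variable.
:-a;
```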
In addition, two or more DataFrames may have columns with some of the same names. In this case, if both are bound using with, only the most recently bound column is available at the top level under the bare name. For example, if df and df2 both have a column a and df2 is bound last, then a refers to the column of df2.
You can still refer to a column of any other DataFrame by naming the DataFrame it comes from, using syntax of the form df[':-a']. If you wrote just df[a], the name a would first evaluate to the corresponding column of df2, and df could not infer which column is meant, which results in an error message. Writing :-a ensures that the global name a is used rather than the column name from df2. In this case, however, that global name has been assigned a value (namely, 12.3), and :-a must not evaluate to that value either: if it did, df still could not infer which column is meant. The unevaluation quotes prevent :-a from evaluating.
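The following sketch puts these pieces together, assuming a fresh session. The names df, df2, and the value 12.3 come from the text above; the matrix entries and the extra column names b and c are hypothetical.

```maple
# A global variable that shares its name with a column label.
a := 12.3;

# Two hypothetical DataFrames that both have a column labelled a.
# The quotes keep the labels as names, since a already has a value.
df  := DataFrame(Matrix([[1, 2], [3, 4]]), columns = ['a', 'b']);
df2 := DataFrame(Matrix([[5, 6], [7, 8]]), columns = ['a', 'c']);

# Bind df first and df2 second, so that df2 is the most recent binding.
with(df);
with(df2);

# The bare name refers to the column of the most recently bound DataFrame, df2.
a;

# Column a of df: name the DataFrame, use the global name :-a,
# and quote it so that it does not evaluate to 12.3.
df[':-a'];
```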