DONE Handling of table column names and hlines across languages

  • State "DONE" from "TODO" 2010-04-23 Fri 09:42
  • State "TODO" from "PROPOSED" 2010-03-26 Fri 21:38
  • State "PROPOSED" from "" 2010-02-23 Tue 20:04
  • State "TODO" from "" 2010-02-23 Tue 12:22

Org-babel now supports three new header arguments, along with new default behavior, for handling horizontal lines in tables (hlines), column names, and row names across all languages. They are as follows:

:hlines
Can take on the values of "yes" or "no", with a default value of "no". These values have the following effects.
"no"
results in all hlines being stripped from the input table. In most languages this is the desired effect, as a raw 'hline symbol is generally interpreted as an unbound variable and leads to an error. The following table would previously have led to an error but is now processed as shown.
#+tblname: many-cols
| a | b | c |
|---+---+---|
| d | e | f |
|---+---+---|
| g | h | i |

#+source: echo-table
#+begin_src python :var tab=many-cols
  return tab
#+end_src

#+results: echo-table
| a | b | c |
| d | e | f |
| g | h | i |
"yes"
leaves hlines in the table. This is the default for emacs-lisp, where code may want to handle 'hline symbols explicitly.
:colnames
Can take on the values of "yes", "no", or nil for unassigned. The default value is nil. These values have the following effects.
nil
If an input table looks like it has column names (that is, if its second row is an hline), org-babel removes the column names from the table before processing and reapplies them to the results. For example, the following code block has the effect shown.
#+tblname: less-cols
| a |
|---|
| b |
| c |

#+srcname: echo-table-again
#+begin_src python :var tab=less-cols
  return [[val + '*' for val in row] for row in tab]
#+end_src

#+results: echo-table-again
| a  |
|----|
| b* |
| c* |
"no"
No column name pre-processing will take place.
"yes"
Column names are removed and reapplied as with nil, even if the table does not look like it has column names (i.e. the second row is not an hline).
:rownames
Can take on the values of "yes" or "no", with a default value of "no". These values have the following effects.
"no"
No row name pre-processing will take place.
"yes"
The first column of the table is removed from the table by org-babel before processing, and is then reapplied to the results. This has the effect shown below.
#+tblname: with-rownames
| one | 1 | 2 | 3 | 4 |  5 |
| two | 6 | 7 | 8 | 9 | 10 |

#+srcname: echo-table-once-again
#+begin_src python :var tab=with-rownames :rownames yes
  return [[val + 10 for val in row] for row in tab]
#+end_src

#+results: echo-table-once-again
| one | 11 | 12 | 13 | 14 | 15 |
| two | 16 | 17 | 18 | 19 | 20 |
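
The pre- and post-processing described for the three header arguments can be modeled in a few lines of Python. This is only an illustrative sketch, not Org-babel's actual implementation (which is written in Emacs Lisp). The `HLINE` sentinel and the helper names `strip_hlines`, `split_colnames`, and `run_block` are hypothetical, and a table is assumed to be a list of rows in which an hline appears as the sentinel.

```python
HLINE = "hline"  # hypothetical stand-in for the elisp symbol 'hline


def strip_hlines(table):
    """Model of `:hlines no`: drop every hline row."""
    return [row for row in table if row != HLINE]


def split_colnames(table, colnames=None):
    """Return (header, body); header is None when no names are split off."""
    has_header = len(table) > 1 and table[1] == HLINE
    if colnames == "no" or (colnames is None and not has_header):
        return None, table
    # Skip the header row, plus the hline under it if there is one.
    body = table[2:] if has_header else table[1:]
    return table[0], body


def run_block(table, func, colnames=None, rownames="no"):
    """Strip names, call func on the bare data, then reapply the names."""
    header, body = split_colnames(table, colnames)
    body = strip_hlines(body)
    names = None
    if rownames == "yes":
        names = [row[0] for row in body]
        body = [row[1:] for row in body]
    result = func(body)
    if names is not None:
        result = [[name] + row for name, row in zip(names, result)]
    if header is not None:
        result = [header, HLINE] + result
    return result


# The less-cols and with-rownames tables from the examples above.
less_cols = [["a"], HLINE, ["b"], ["c"]]
with_rownames = [["one", 1, 2, 3, 4, 5], ["two", 6, 7, 8, 9, 10]]

print(run_block(less_cols, lambda tab: [[v + "*" for v in row] for row in tab]))
print(run_block(with_rownames,
                lambda tab: [[v + 10 for v in row] for row in tab],
                rownames="yes"))
```

In this model the code block body (`func`) only ever sees the bare data cells, which is why the Python examples above can operate on every value in every row without special-casing names or hlines.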

Thanks to Julien Barnier for adding rownames support in R.

discussion

See also Support rownames and other org babel table features?

Julien Barnier has made a patch implementing rownames in R. This is in branch julien-barnier-R-rownames in the devel repo (commit f29c00432e6091bc1fbce8d1eb9052eff61da7b7).

There is a test file for column and rownames here.

From Tom Dye

IIUC, the difficulty is introduced by the difference between R, which keeps row names "under the hood," and org-mode, which doesn't have a concept of row names. So, the question becomes one of preserving R's row names in cases where that is desirable. Because it is not possible, AFAIK, to distinguish between an org-mode table created through an R call to print a data frame and one made with org-tbl, the onus is on the user to preserve R row names.

One way would be to establish an idiom for exporting and importing R data frames and put it on Worg. This one works for me.

So, on the way out: cbind(row=rownames(df),df)

And, on the way in: df <- data.frame(x, row.names=1)

If you want, I can add this, or something like it, to org-babel-doc-R.

Also, I've been using the reshape package, and a melt/cast sequence, which I use frequently, keeps the row names in the first column, so I only have to be conscious of preserving row names on the way back into org-mode.

Examples

| Date       |   Kg |
|------------+------|
| 2010-02-21 | 95.0 |
| 2010-02-22 | 93.0 |
| 2010-02-23 | 92.0 |
| 2010-02-24 | 91.5 |
| 2010-02-25 | 91.0 |
| 2010-02-26 | 92.0 |

As things stand

python

return d

This results in an error because of the 'hline symbol.

We could remove the hline with the following, but we need to think about whether to include the column names or not.

(defun org-babel-python-var-to-python (var)
  "Convert an elisp var into a string of python source code
specifying a var of the same value."
  (if (listp var)
      (concat "[" (mapconcat #'org-babel-python-var-to-python (remq 'hline var) ", ") "]")
      ;; (concat "[" (mapconcat #'org-babel-python-var-to-python var ", ") "]")
    (format "%S" var)))

That change would give this as result:

| Date       |   Kg |
| 2010-02-21 | 95.0 |
| 2010-02-22 | 93.0 |
| 2010-02-23 | 92.0 |
| 2010-02-24 | 91.5 |
| 2010-02-25 | 91.0 |
| 2010-02-26 | 92.0 |
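
The effect of the `remq 'hline` change can be illustrated with a Python analogue of the converter. This is a sketch only: `var_to_python` and the `HLINE` sentinel are hypothetical names standing in for the elisp function and the 'hline symbol, and `repr` is assumed to be an adequate substitute for `(format "%S" var)`.

```python
HLINE = object()  # hypothetical stand-in for the elisp symbol 'hline


def var_to_python(var):
    """Render a (possibly nested) list as Python source, skipping hlines."""
    if isinstance(var, list):
        # Mirrors (remq 'hline var): drop hline markers at this level,
        # then convert the remaining elements recursively.
        items = [var_to_python(item) for item in var if item is not HLINE]
        return "[" + ", ".join(items) + "]"
    return repr(var)


table = [["Date", "Kg"], HLINE, ["2010-02-21", 95.0], ["2010-02-22", 93.0]]
source = var_to_python(table)
print(source)
```

Because the hline is dropped rather than serialized, the generated source contains only lists, strings, and numbers, so the header row arrives in the code block as an ordinary first row.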

R

d
| Date       |   Kg |
| 2010-02-21 | 95.0 |

NB Is it unfortunate that a named simple vector doesn't get its names printed out with :colnames yes?

c(a=1,b=2)

This is because a 1d vector gets turned into a table with one column, and hence its names would be rownames, not column names. One has to transpose the vector in R to get the desired result.

t(c(a=1,b=2))
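
The shape argument can be sketched in Python. This is an illustrative model only, not how Org-babel or R represents these values: the helpers `vector_to_table` and `transpose` are hypothetical, and a dict is assumed as a stand-in for R's named vector.

```python
def vector_to_table(values):
    """A 1-d vector becomes a table with a single column."""
    return [[value] for value in values]


def transpose(table):
    """Swap rows and columns, as t() does in R."""
    return [list(column) for column in zip(*table)]


vec = {"a": 1, "b": 2}                   # models c(a=1, b=2)
names = list(vec)                        # ["a", "b"]
column = vector_to_table(vec.values())   # [[1], [2]]: one name per row
row = transpose(column)                  # [[1, 2]]: one name per column

print(len(column) == len(names))   # names line up with the rows
print(len(row[0]) == len(names))   # after transposing, with the columns
```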

shell

echo $d

Error:

org-babel-sh-var-to-sh wrongly converts 'hline into "hline", resulting in an error in orgtbl-to-generic. We could change the last line of org-babel-sh-var-to-sh from

(if (stringp var) (format "%s" var) (format "%S" var))))

to

(cond ((eq var 'hline) var) ((stringp var) (format "%s" var)) (t (format "%S" var)))

But we need to think about whether the hline should even be there at this stage, or whether hlines and column names should already have been removed (at least an hline as the 2nd element of the elisp table).