Lecture 2: Data Frame, Matrix, List

Abhijit Dasgupta

September 19, 2018

Preamble

Practice makes perfect

R coding conventions

## [1] 1 2 3
## [1] 1 2 3 4 5 6 7

Google has a style guide for how to write R code

R packages

R packages live on CRAN and its mirrors. To install an R package:

or

R packages

To use a package, or rather, use the functions from the package, you have to load it into R
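A minimal sketch of the install-then-load workflow. The slide's own code is not shown, so the package choice is an assumption: readxl is used only because these slides name it for reading Excel files. The install call is commented out since it needs network access.

```r
# Installing downloads the package from CRAN; this is needed once per machine.
# install.packages("readxl")

# Check whether the package is installed, without raising an error if it is not.
has_readxl <- requireNamespace("readxl", quietly = TRUE)

# Loading attaches the package's functions for the current R session.
if (has_readxl) {
  library(readxl)
}

# Packages that ship with R, such as stats, are loaded the same way.
library(stats)
has_readxl
```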

We’ll talk about packages later in the semester.

We will concentrate now on what is known as Base R, that is, the functions that are available when R is installed

Loading data

We will usually load CSV files, since they are the easiest for R. The typical suggestion if you have Excel data is to save the sheet as a CSV and then import it into R.
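A short sketch of loading a CSV with base R's read.csv(). The file contents below are made-up illustration values (not course data), written to a temporary file so the example runs on its own.

```r
# Write a tiny CSV to a temporary file so the example is self-contained.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "id,height,group",
  "1,1.62,A",
  "2,1.75,B",
  "3,1.80,A"
), csv_path)

# read.csv() returns a data.frame; by default the first row supplies the names.
dat <- read.csv(csv_path)
head(dat)
```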

You can also load Excel files directly using either the readxl or rio packages

The structure of data sets

Tables

  • Data is typically in a rectangular format

    • spreadsheet, database table
    • CSV (comma-separated values) or TSV (tab-separated values) files
  • Characteristics

    • Rows are observations
    • Columns are variables
    • Each column has the same number of observations

Tidy data is a particularly amenable format for data analysis.

An example [GEO](http://www.ncbi.nlm.nih.gov/geo/) dataset

Lower back pain symptoms dataset on Kaggle.com

Breast Cancer Proteome dataset on Kaggle.com

Let’s look at a dataset

  • Assumes that the first row has variable names
  • Replaces spaces with .
  • Keeps numeric and character variables together
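The three behaviours listed above can be seen in a small sketch. The header line with spaces is an assumption about how the raw file looks; the values are the first rows shown later in these slides. stringsAsFactors = TRUE is spelled out because its default changed to FALSE in R 4.0.0, while the output in these slides shows a factor.

```r
# A three-row stand-in for the spine file, with spaces in the header.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Pelvic incidence,Pelvic tilt,Class attribute",
  "63.02782,22.552586,Abnormal",
  "39.05695,10.060991,Abnormal",
  "68.83202,22.218482,Abnormal"
), csv_path)

# header = TRUE and check.names = TRUE are the defaults, written out for clarity.
data_spine <- read.csv(csv_path, header = TRUE, check.names = TRUE,
                       stringsAsFactors = TRUE)

names(data_spine)  # spaces in the header have become dots
str(data_spine)    # numeric and text columns sit side by side
```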

Let’s look at a dataset

## 'data.frame':    310 obs. of  13 variables:
##  $ Pelvic.incidence        : num  63 39.1 68.8 69.3 49.7 ...
##  $ Pelvic.tilt             : num  22.55 10.06 22.22 24.65 9.65 ...
##  $ Lumbar.lordosis.angle   : num  39.6 25 50.1 44.3 28.3 ...
##  $ Sacral.slope            : num  40.5 29 46.6 44.6 40.1 ...
##  $ Pelvic.radius           : num  98.7 114.4 106 101.9 108.2 ...
##  $ Degree.spondylolisthesis: num  -0.254 4.564 -3.53 11.212 7.919 ...
##  $ Pelvic.slope            : num  0.745 0.415 0.475 0.369 0.543 ...
##  $ Direct.tilt             : num  12.6 12.9 26.8 23.6 35.5 ...
##  $ Thoracic.slope          : num  14.5 17.5 17.5 12.7 16 ...
##  $ Cervical.tilt           : num  15.3 16.78 16.66 11.42 8.87 ...
##  $ Sacrum.angle            : num  -28.7 -25.5 -29 -30.5 -16.4 ...
##  $ Scoliosis.slope         : num  43.5 16.1 19.2 18.8 24.9 ...
##  $ Class.attribute         : Factor w/ 2 levels "Abnormal","Normal": 1 1 1 1 1 1 1 1 1 1 ...

So this is a data.frame object with 310 observations and 13 variables, of which one is a factor and the rest are numeric

The str() output reads like a list: one named entry per variable

Dataframes

Dataframes are the primary mode of storing datasets in R

They were revolutionary in that they kept heterogeneous data together

They share properties of both a matrix and a list

## [1] "data.frame"

Technically, a data.frame is a list of vectors (or objects, generally) of the same length
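A quick way to check the "list of equal-length vectors" claim, using a small hand-built data frame (an illustration, not the full spine data):

```r
df <- data.frame(
  Pelvic.tilt = c(22.552586, 10.060991, 22.218482),
  Class.attribute = c("Abnormal", "Abnormal", "Abnormal")
)

class(df)            # the class attribute is "data.frame"
typeof(df)           # the underlying storage type is "list"
is.list(df)          # TRUE
length(df)           # number of list elements, that is, columns
sapply(df, length)   # every column has the same length
```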

Matrices

A matrix is a rectangular array of data of the same type

##      [,1] [,2] [,3] [,4]
## [1,]    0    0    0    0
## [2,]    0    0    0    0
##      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
## [1,] "a"  "c"  "e"  "g"  "i"  "k"  "m"  "o"  "q"  "s"   "u"   "w"   "y"  
## [2,] "b"  "d"  "f"  "h"  "j"  "l"  "n"  "p"  "r"  "t"   "v"   "x"   "z"
##      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
## [1,] "a"  "b"  "c"  "d"  "e"  "f"  "g"  "h"  "i"  "j"   "k"   "l"   "m"  
## [2,] "n"  "o"  "p"  "q"  "r"  "s"  "t"  "u"  "v"  "w"   "x"   "y"   "z"
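The code behind the output above is not shown on the slide; calls along these lines would produce matrices of the same shape and content. The object names are assumptions.

```r
# A 2 x 4 matrix filled with zeros.
m_zero <- matrix(0, nrow = 2, ncol = 4)

# The 26 lowercase letters, filled column by column (the default).
m_bycol <- matrix(letters, nrow = 2)

# The same letters, filled row by row.
m_byrow <- matrix(letters, nrow = 2, byrow = TRUE)

m_zero
m_bycol
m_byrow
```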

Matrices

You can create a matrix from a set of vectors of the same length

Put columns together

##      [,1] [,2]
## [1,]    1   10
## [2,]    2   20
## [3,]    3   30
## [4,]    4   40

Matrices

You can create a matrix from a set of vectors of the same length

Put rows together

##      [,1] [,2] [,3] [,4]
## [1,]    1    2    3    4
## [2,]   10   20   30   40
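Both constructions can be sketched with cbind() and rbind(). The vector names x and y are assumptions; deparse.level = 0 suppresses the automatic row and column labels so the result prints with plain indices, as in the output shown.

```r
x <- c(1, 2, 3, 4)
y <- c(10, 20, 30, 40)

# Put columns together: a 4 x 2 matrix.
by_col <- cbind(x, y, deparse.level = 0)

# Put rows together: a 2 x 4 matrix.
by_row <- rbind(x, y, deparse.level = 0)

by_col
by_row
```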

Extracting elements

##      [,1] [,2] [,3] [,4]
## [1,]    1    2    3    4
## [2,]   10   20   30   40
## [1] 1 2 3 4
##      [,1] [,2]
## [1,]    2    3
## [2,]   20   30
## [1] 4
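A sketch of the bracket indexing that yields results like those above. The matrix name m is an assumption; the last line goes slightly beyond the slide to show how to keep a single row as a matrix.

```r
m <- rbind(c(1, 2, 3, 4), c(10, 20, 30, 40))

m[1, ]                # first row, returned as a plain vector
m[, 2:3]              # columns 2 and 3 of every row, still a matrix
m[1, 4]               # the single element in row 1, column 4
m[1, , drop = FALSE]  # first row kept as a 1 x 4 matrix
```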

Matrix properties

##      [,1] [,2] [,3] [,4]
## [1,]    1    2    3    4
## [2,]   10   20   30   40
## [1] 2
## [1] 4
## [1] 2 4
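The three values above are what nrow(), ncol() and dim() return for this matrix (the name m is an assumption):

```r
m <- rbind(c(1, 2, 3, 4), c(10, 20, 30, 40))

nrow(m)  # number of rows
ncol(m)  # number of columns
dim(m)   # both at once: rows, then columns
```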

Matrix arithmetic

##      [,1] [,2] [,3] [,4]
## [1,]    1    2    3    4
## [2,]   10   20   30   40
##      [,1] [,2] [,3] [,4]
## [1,]    6    7    8    9
## [2,]   15   25   35   45
##      [,1] [,2] [,3] [,4]
## [1,]    2    4    6    8
## [2,]   20   40   60   80
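Arithmetic with a single number is applied to every element. A sketch matching the output above (the names are assumptions):

```r
m <- rbind(c(1, 2, 3, 4), c(10, 20, 30, 40))

m_plus  <- m + 5  # add 5 to every element
m_times <- m * 2  # double every element

m_plus
m_times
```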

Two matrices

##      [,1] [,2] [,3] [,4]
## [1,]    1    2    3    4
## [2,]   10   20   30   40
##      [,1] [,2] [,3] [,4]
## [1,]    3    4    5    6
## [2,]    9   10   11   12
##      [,1] [,2] [,3] [,4]
## [1,]    4    6    8   10
## [2,]   19   30   41   52

Two matrices

##      [,1] [,2] [,3] [,4]
## [1,]    1    2    3    4
## [2,]   10   20   30   40
##      [,1] [,2] [,3] [,4]
## [1,]    3    4    5    6
## [2,]    9   10   11   12
##      [,1] [,2] [,3] [,4]
## [1,]    3    8   15   24
## [2,]   90  200  330  480
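With two matrices of the same dimensions, + and * both work element by element. Note that * is not matrix multiplication. The names m and m2 are assumptions.

```r
m  <- rbind(c(1, 2, 3, 4), c(10, 20, 30, 40))
m2 <- rbind(c(3, 4, 5, 6), c(9, 10, 11, 12))

m_sum  <- m + m2  # element-wise sum
m_prod <- m * m2  # element-wise product

m_sum
m_prod
```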

Two matrices

##      [,1] [,2] [,3] [,4]
## [1,]    1    2    3    4
## [2,]   10   20   30   40
## [3,]    3    4    5    6
## [4,]    9   10   11   12
##      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,]    1    2    3    4    3    4    5    6
## [2,]   10   20   30   40    9   10   11   12
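Two matrices can also be stacked or placed side by side, as in the output above. A sketch with assumed names m and m2:

```r
m  <- rbind(c(1, 2, 3, 4), c(10, 20, 30, 40))
m2 <- rbind(c(3, 4, 5, 6), c(9, 10, 11, 12))

stacked <- rbind(m, m2)  # 4 x 4: needs the same number of columns
side    <- cbind(m, m2)  # 2 x 8: needs the same number of rows

stacked
side
```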

Two matrices

## [1] 2 4
##      [,1] [,2]
## [1,]    3    9
## [2,]    4   10
## [3,]    5   11
## [4,]    6   12
##      [,1] [,2]
## [1,]   50  110
## [2,]  500 1100
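The output above is consistent with a transpose followed by a true matrix product, which R writes as %*%. A sketch with assumed names:

```r
m  <- rbind(c(1, 2, 3, 4), c(10, 20, 30, 40))
m2 <- rbind(c(3, 4, 5, 6), c(9, 10, 11, 12))

dim(m)           # 2 x 4
m2_t <- t(m2)    # transpose: 4 x 2
m2_t

# Matrix product: (2 x 4) %*% (4 x 2) gives a 2 x 2 matrix.
prod_mat <- m %*% m2_t
prod_mat
```

The inner dimensions must agree: m %*% m2 would fail here because both matrices are 2 x 4.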

Lists

Lists are collections of arbitrary objects in R

## [[1]]
## [1] "Andy"  "Brian" "Harry"
## 
## [[2]]
## [1] 12 16 16
## 
## [[3]]
## [1]  TRUE  TRUE FALSE
## 
## [[4]]
##      [,1] [,2] [,3]
## [1,]    1    1    1
## [2,]    1    1    1
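A list like the one printed above can be built with list(). The name my_list is an assumption.

```r
# Each element can have a different type and shape.
my_list <- list(
  c("Andy", "Brian", "Harry"),    # character vector
  c(12, 16, 16),                  # numeric vector
  c(TRUE, TRUE, FALSE),           # logical vector
  matrix(1, nrow = 2, ncol = 3)   # 2 x 3 matrix of ones
)
my_list
```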

Extracting elements from lists

## [1]  TRUE  TRUE FALSE
## [[1]]
## [1] "Andy"  "Brian" "Harry"
## 
## [[2]]
## [1] 12 16 16

Extracting elements from lists

##      [,1] [,2] [,3]
## [1,]    1    1    1
## [2,]    1    1    1
## [1] "matrix"
## [1] 1 1 1
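A sketch of the two kinds of brackets, using an assumed name my_list. Double brackets return the element itself; single brackets return a shorter list. In R 4.0.0 and later, class() of a matrix returns both "matrix" and "array", so the output can differ from the slide.

```r
my_list <- list(
  c("Andy", "Brian", "Harry"),
  c(12, 16, 16),
  c(TRUE, TRUE, FALSE),
  matrix(1, nrow = 2, ncol = 3)
)

third     <- my_list[[3]]  # the logical vector itself
first_two <- my_list[1:2]  # a list holding the first two elements

inner <- my_list[[4]]      # the matrix stored in position 4
class(inner)
inner[1, ]                 # first row of that matrix
```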

Named lists

## [1] "Andy"  "Brian" "Harry"
## [1] "Andy"  "Brian" "Harry"
## [1] "Harry"
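When list elements have names, they can be extracted by name. The element names person and value below are hypothetical; the slide does not show the names it used.

```r
my_list <- list(
  person = c("Andy", "Brian", "Harry"),
  value  = c(12, 16, 16)
)

my_list$person        # dollar-based extraction
my_list[["person"]]   # the same element, by name in double brackets
my_list$person[3]     # third entry of that element
```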

Back to a Data Frame

A data.frame object is a named list where each element is of the same length

You can use both matrix and list functions to operate on data.frame objects
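A sketch of matrix-style and list-style functions applied to the same data frame. The four rows are taken from the spine output in these slides; building the object by hand is an assumption made to keep the example self-contained.

```r
data_spine_small <- data.frame(
  Pelvic.tilt     = c(22.55259, 10.06099, 22.21848, 24.65288),
  Pelvic.slope    = c(0.7445035, 0.4151857, 0.4748892, 0.3693453),
  Class.attribute = factor(rep("Abnormal", 4), levels = c("Abnormal", "Normal"))
)

# Matrix-style: dimensions and [row, column] indexing.
dim(data_spine_small)
nrow(data_spine_small)
data_spine_small[2, "Pelvic.tilt"]

# List-style: one element per column.
length(data_spine_small)
names(data_spine_small)
sapply(data_spine_small, class)
```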

Data Frames

##   Pelvic.incidence Pelvic.tilt Lumbar.lordosis.angle Sacral.slope
## 1         63.02782   22.552586              39.60912     40.47523
## 2         39.05695   10.060991              25.01538     28.99596
## 3         68.83202   22.218482              50.09219     46.61354
## 4         69.29701   24.652878              44.31124     44.64413
## 5         49.71286    9.652075              28.31741     40.06078
## 6         40.25020   13.921907              25.12495     26.32829
##   Pelvic.radius Degree.spondylolisthesis Pelvic.slope Direct.tilt
## 1      98.67292                -0.254400    0.7445035     12.5661
## 2     114.40543                 4.564259    0.4151857     12.8874
## 3     105.98514                -3.530317    0.4748892     26.8343
## 4     101.86850                11.211523    0.3693453     23.5603
## 5     108.16872                 7.918501    0.5433605     35.4940
## 6     130.32787                 2.230652    0.7899929     29.3230
##   Thoracic.slope Cervical.tilt Sacrum.angle Scoliosis.slope
## 1        14.5386      15.30468   -28.658501         43.5123
## 2        17.5323      16.78486   -25.530607         16.1102
## 3        17.4861      16.65897   -29.031888         19.2221
## 4        12.7074      11.42447   -30.470246         18.8329
## 5        15.9546       8.87237   -16.378376         24.9171
## 6        12.0036      10.40462    -1.512209          9.6548
##   Class.attribute
## 1        Abnormal
## 2        Abnormal
## 3        Abnormal
## 4        Abnormal
## 5        Abnormal
## 6        Abnormal

Data Frames

## [1] 310  13
## [1] 310

Data Frames

## [1] 22.55259 10.06099 22.21848 24.65288
## [1] 22.55259 10.06099 22.21848 24.65288

Data Frames

## [1] 22.55259 10.06099 22.21848 24.65288
## [1] 22.55259 10.06099 22.21848 24.65288
## [1] 22.55259 10.06099 22.21848 24.65288

Data Frames

My preference is for

  1. Data frame named column extraction: data_spine_small[,'Pelvic.tilt']
  2. Named list extraction: data_spine_small[['Pelvic.tilt']]
  3. Dollar-based extraction: data_spine_small$Pelvic.tilt
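The three forms return the same vector. A small check, using four rows from the spine output (the hand-built data frame is an assumption for self-containment):

```r
data_spine_small <- data.frame(
  Pelvic.tilt  = c(22.55259, 10.06099, 22.21848, 24.65288),
  Pelvic.slope = c(0.7445035, 0.4151857, 0.4748892, 0.3693453)
)

by_matrix <- data_spine_small[, "Pelvic.tilt"]   # matrix-style, by column name
by_list   <- data_spine_small[["Pelvic.tilt"]]   # list-style, by element name
by_dollar <- data_spine_small$Pelvic.tilt        # dollar shortcut

identical(by_matrix, by_list) && identical(by_list, by_dollar)
```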

Data Frames

##  [1] "Pelvic.incidence"         "Pelvic.tilt"             
##  [3] "Lumbar.lordosis.angle"    "Sacral.slope"            
##  [5] "Pelvic.radius"            "Degree.spondylolisthesis"
##  [7] "Pelvic.slope"             "Direct.tilt"             
##  [9] "Thoracic.slope"           "Cervical.tilt"           
## [11] "Sacrum.angle"             "Scoliosis.slope"         
## [13] "Class.attribute"
##   Pelvic.tilt Pelvic.slope Class.attribute
## 1    22.55259    0.7445035        Abnormal
## 2    10.06099    0.4151857        Abnormal
## 3    22.21848    0.4748892        Abnormal
## 4    24.65288    0.3693453        Abnormal
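A sketch of listing the variable names and then keeping only some columns by name. The data frame is a hand-built subset of the spine rows shown in these slides, so its column list is shorter than the 13 names printed above.

```r
data_spine_small <- data.frame(
  Pelvic.incidence = c(63.02782, 39.05695, 68.83202, 69.29701),
  Pelvic.tilt      = c(22.55259, 10.06099, 22.21848, 24.65288),
  Pelvic.slope     = c(0.7445035, 0.4151857, 0.4748892, 0.3693453),
  Class.attribute  = rep("Abnormal", 4)
)

names(data_spine_small)

# Select columns by passing a character vector of names.
keep_cols <- c("Pelvic.tilt", "Pelvic.slope", "Class.attribute")
selected <- data_spine_small[, keep_cols]
selected
```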

Filtering data frames

Boolean operators

Operator   Meaning
|          Or
&          And
!          Not
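The operators work element by element on logical vectors. In this sketch the values come from the first four spine rows, and the thresholds are arbitrary illustration choices.

```r
tilt  <- c(22.55259, 10.06099, 22.21848, 24.65288)
slope <- c(0.7445035, 0.4151857, 0.4748892, 0.3693453)

either  <- tilt > 20 | slope > 0.7  # TRUE where at least one condition holds
both    <- tilt > 20 & slope > 0.4  # TRUE where both conditions hold
negated <- !(tilt > 20)             # flips TRUE and FALSE

either
both
negated
```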

Filtering data frames

##     Pelvic.incidence Pelvic.tilt Lumbar.lordosis.angle Sacral.slope
## 1           63.02782    22.55259              39.60912     40.47523
## 3           68.83202    22.21848              50.09219     46.61354
## 4           69.29701    24.65288              44.31124     44.64413
## 14          53.57217    20.46083              33.10000     33.11134
## 15          57.30023    24.18888              47.00000     33.11134
## 17          63.83498    20.36251              54.55243     43.47247
## 22          54.91944    21.06233              42.20000     33.85711
## 23          63.07361    24.41380              54.00000     38.65981
## 25          36.12568    22.75875              29.00000     13.36693
## 26          54.12492    26.65049              35.32975     27.47443
## 29          44.55101    21.93115              26.78592     22.61986
## 30          66.87921    24.89200              49.27860     41.98721
## 35          59.59554    31.99824              46.56025     27.59730
## 39          55.84329    28.84745              47.69054     26.99584
## 44          66.28539    26.32784              47.50000     39.95755
## 46          50.91244    23.01517              47.00000     27.89727
## 47          48.33264    22.22778              36.18199     26.10485
## 51          55.28585    20.44012              34.00000     34.84573
## 52          74.43359    41.55733              27.70000     32.87626
## 53          50.20967    29.76012              36.10401     20.44955
## 61          74.37768    32.05310              78.77201     42.32457
## 62          89.68057    32.70443              83.13073     56.97613
## 64          77.69058    21.38064              64.42944     56.30993
## 65          76.14721    21.93619              82.96150     54.21103
## 66          83.93301    41.28631              62.00000     42.64670
## 67          78.49173    22.18180              60.00000     56.30993
## 72          86.90079    32.92817              47.79435     53.97263
## 73          84.97413    33.02117              60.85987     51.95296
## 74          55.51221    20.09516              44.00000     35.41706
## 75          72.22233    23.07771              91.00000     49.14462
## 76          70.22145    39.82272              68.11840     30.39873
## 77          86.75361    36.04302              69.22104     50.71059
## 81          77.10657    30.46999              69.48063     46.63658
## 82          74.00554    21.12240              57.37950     52.88314
## 83          88.62391    29.08945              47.56426     59.53446
## 84          81.10410    24.79417              77.88702     56.30993
## 85          76.32600    42.39620              57.20000     33.92980
## 90          71.18681    23.89620              43.69667     47.29061
## 91          81.65603    28.74887              58.23282     52.90716
## 92          70.95273    20.15993              62.85911     50.79280
## 96          57.52236    33.64708              50.90986     23.87528
## 99          77.65512    22.43295              93.89278     55.22217
## 101         84.58561    30.36168              65.47949     54.22392
## 105         77.40933    29.39655              63.23230     48.01279
## 106         65.00796    27.60261              50.94752     37.40536
## 108         78.42595    33.42595              76.27744     45.00000
## 112         84.99896    29.61010              83.35219     55.38886
## 115         80.98807    36.84317              86.96060     44.14490
## 118         86.04128    38.75067              47.87140     47.29061
## 119         65.53600    24.15749              45.77517     41.37852
## 122         83.87994    23.07743              87.14151     60.80251
## 123         80.07491    48.06953              52.40344     32.00538
## 127         70.67690    21.70440              59.18116     48.97250
## 129         90.51396    28.27250              69.81394     62.24146
## 133         69.62628    21.12275              52.76659     48.50353
## 134         81.75442    20.12347              70.56044     61.63095
## 136         77.12134    30.34987              77.48108     46.77147
## 137         88.02450    39.84467              81.77447     48.17983
## 138         83.39661    34.31099              78.42329     49.08562
## 139         72.05403    24.70074              79.87402     47.35330
## 140         85.09550    21.06990              91.73479     64.02561
## 142         89.50495    48.90365              72.00342     40.60129
## 144         60.62622    20.59596              64.53526     40.03026
## 146         85.64379    42.68920              78.75066     42.95459
## 147         85.58171    30.45704              78.23138     55.12467
## 150         79.24967    23.94482              40.79670     55.30485
## 151         81.11260    20.69044              60.68701     60.42216
## 157         79.47698    26.73227              70.65098     52.74471
## 161         92.02631    35.39267              77.41696     56.63363
## 163        118.14465    38.44950              50.83852     79.69515
## 164        115.92326    37.51544              76.80000     78.40782
## 166         83.70318    20.26823              77.11060     63.43495
## 169         95.38260    24.82263              95.15763     70.55997
## 175         61.41174    25.38436              39.09687     36.02737
## 179         80.65432    26.34438              60.89812     54.30994
## 180         68.72191    49.43186              68.05601     19.29005
##     Pelvic.radius Degree.spondylolisthesis Pelvic.slope Direct.tilt
## 1        98.67292               -0.2544000  0.744503464     12.5661
## 3       105.98514               -3.5303173  0.474889164     26.8343
## 4       101.86850               11.2115234  0.369345264     23.5603
## 14      110.96670                7.0448029  0.081930993     15.0580
## 15      116.80659                5.7669469  0.416721511     16.5158
## 17      112.30949               -0.6225266  0.560675371     10.7690
## 22      125.21272                2.4325614  0.175244572     23.0791
## 23      106.42433               15.7796968  0.666388008     11.9696
## 25      115.57712               -3.2375625  0.126473707     25.6206
## 26      121.44701                1.5712048  0.928687869     14.6686
## 29      111.07292                2.6523206  0.527891438     32.4275
## 30      113.47702               -2.0058917  0.677267795     12.4271
## 35      119.33035                1.4742858  0.477087978      8.6051
## 39      123.31184                2.8124269  0.142325313     12.6634
## 44      121.21968               -0.7996245  0.647625866      9.0466
## 46      117.42226               -2.5267015  0.319204878     30.6389
## 47      117.38463                6.4817091  0.062276845     23.5538
## 51      115.87702                3.5583724  0.680654711     16.7110
## 52      107.94930                5.0000888  0.606767772     32.6283
## 53      128.29251                5.7406141  0.031139167     36.1431
## 61      143.56069               56.1259060  0.159377926     35.9529
## 62      129.95548               92.0272768  0.527584239     26.3756
## 64      114.81875               26.9318410  0.046225827     24.5548
## 65      123.93201               10.4319719  0.252796128     21.5934
## 66      115.01233               26.5881002  0.614766530      8.8345
## 67      118.53033               27.3832131  0.008485564      7.5647
## 72      135.07536              101.7190919  0.459674217     25.0986
## 73      125.65953               74.3334086  0.600115713     25.6364
## 74      122.64875               34.5529464  0.672041297     35.0909
## 75      137.73665               56.8040928  0.826532439     32.3379
## 76      148.52556              145.3781432  0.946610646     10.3840
## 77      139.41450              110.8607824  0.640487619     16.5571
## 81      112.15160               70.7590831  0.156502675     10.8490
## 82      120.20596               74.5551659  0.406965314     10.5895
## 83      121.76478               51.8058992  0.770613922     13.1962
## 84      151.83986               65.2146161  0.972005589     10.5715
## 85      124.26701               50.1274569  0.583098014     33.1635
## 90      119.86494               27.2839845  0.402241133      9.1296
## 91      114.76986               30.6091484  0.832811085     23.1811
## 92      116.17793               32.5223310  0.054842915      7.3173
## 96      140.98171              148.7537109  0.597457003     21.5943
## 99      123.05571               61.2111866  0.924902858     14.9502
## 101     108.01022               25.1184785  0.341664609     30.4108
## 105     118.45073               93.5637373  0.375287380     11.2385
## 106     116.58111                7.0159779  0.867324121     12.1292
## 108     138.55411               77.1551724  0.580604336     36.6285
## 112     126.91299               71.3211754  0.998826684      7.0551
## 115     141.08815               85.8721522  0.496180717     27.5223
## 118     122.09295               61.9882771  0.035210491     27.5499
## 119     136.44030               16.3780856  0.377683884     25.0670
## 122     124.64607               80.5556053  0.436932625      7.2994
## 123     110.70991               67.7273160  0.099940503     20.2822
## 127     103.00835               27.8101478  0.039655293     15.7748
## 129     100.89216               58.8236482  0.881441257     13.5739
## 133     116.80309               54.8168673  0.286894092     18.5916
## 134     119.42509               55.5068891  0.265889494     15.3790
## 136     110.61115               82.0936070  0.278327565     21.9069
## 137     116.60154               56.7660832  0.238984543     14.6182
## 138     110.46652               49.6720956  0.771860278     24.9264
## 139     107.17236               56.4261587  0.296152366      7.7545
## 140     109.06231               38.0328311  0.481861846     17.1681
## 142     134.63429              118.3533701  0.039380359     19.8712
## 144     117.22555              104.8592474  0.386903046     17.0217
## 146     105.14408               42.8874258  0.844294085     16.9272
## 147     114.86605               68.3761218  0.624974835     26.5001
## 150      98.62251               36.7063954  0.672569597     29.0324
## 151      94.01878               40.5109823  0.530804498     11.4265
## 157     118.58867               61.7005982  0.642084751      8.2975
## 161     115.72353               58.0575416  0.302441921     30.0162
## 163      81.02454               74.0437674  0.599392845     35.8563
## 164     104.69860               81.1989271  0.542816002     22.3317
## 166     125.48017               69.2795710  0.735958830     33.8124
## 169      89.30755               57.6608413  0.268276345     28.6901
## 175     103.40460               21.8434069  0.697750246      8.4084
## 179     120.10349               52.4675518  0.931603620     20.0845
## 180     125.01852               54.6912893  0.379895400     17.4281
##     Thoracic.slope Cervical.tilt Sacrum.angle Scoliosis.slope
## 1          14.5386      15.30468   -28.658501         43.5123
## 3          17.4861      16.65897   -29.031888         19.2221
## 4          12.7074      11.42447   -30.470246         18.8329
## 14         12.8127      12.00109    -1.734117         15.6205
## 15         18.6222       8.51898   -33.441303         13.2498
## 17         16.8116      11.41344     2.676002         17.3859
## 22         14.2195      14.14196     3.780394         24.9278
## 23         17.6891       7.63771   -14.183602         44.2338
## 25         15.7438      11.55610   -18.108941         24.1151
## 26         13.5700      16.12951   -17.630363         28.1902
## 29         10.2244      11.71324   -28.506125         28.0470
## 30          8.2495       7.58784    -3.963385         27.3587
## 35          8.3058       8.53700    -0.029028         40.5823
## 39          8.8550      10.55193   -16.404668         15.2954
## 44         10.2636      13.50349     1.138079         34.3683
## 46         18.6181      15.55901     2.537043          9.4310
## 47         11.0942      13.15072    -4.200276         20.0348
## 51         15.9714      14.37627     4.779509         43.2610
## 52          9.8062      11.62142   -10.028289          9.3141
## 53         13.9907      12.90967   -30.430498         31.6999
## 61         15.3975      11.71169   -18.628293         22.5623
## 62         18.6012      16.09596   -18.701330         35.8729
## 64         15.3209      11.10896   -23.279118         16.1132
## 65          7.8098      11.07095   -34.897660         43.1487
## 66          7.2405       9.79573   -20.130727         22.4032
## 67         12.6737       8.03422   -22.037558         32.0972
## 72          8.7655       8.93510   -21.318960         12.8518
## 73         17.7501       7.88600     4.442569         13.4605
## 74          8.6569      14.35843   -33.990906         25.5218
## 75          7.4437      16.44128    -9.487619          9.6867
## 76          9.5742      11.22353     4.641629          9.8472
## 77         17.3635      12.70134   -15.088499          7.0079
## 81         13.2216      11.55132    -1.860646         18.6033
## 82         12.5946      15.87462   -10.624659         31.9699
## 83         13.6590      15.27164    -4.208953         32.9340
## 84         11.2339      13.29506   -12.139219         11.8487
## 85          8.3830      13.75752   -32.106343         18.6868
## 90         16.7172       7.03060    -4.667443         14.5344
## 91         11.2491      11.69024   -25.011107         21.9180
## 92         16.3676      12.21365    -5.091336         17.2601
## 96          7.5666       7.81812   -27.570464         17.8768
## 99         15.0493       7.57722     0.307904         33.7201
## 101        15.7092      11.58279    -1.273566         29.6399
## 105        12.9197      13.82148     6.079425         11.8698
## 106        13.8536      11.95397   -20.735613          9.7675
## 108        16.6264       7.96524   -19.123087         16.1431
## 112         9.0119       9.85541   -19.314135         43.0086
## 115        13.5136      11.60893   -13.600284         34.3656
## 118        12.0080       9.62750     5.603229         36.2899
## 119        13.7801      14.63875   -15.898046         19.5298
## 122        11.1917      16.28150    -8.553212         24.8562
## 123        10.3082      15.89258   -14.156070         39.9730
## 127        14.8568      11.45991   -18.475476         19.8407
## 129        16.3289      16.52676   -10.917156         35.6543
## 133        15.4963      15.92252     1.320769         34.8665
## 134        18.1885      14.66728    -3.994716         33.2091
## 136        17.0071      16.22809    -0.055790         27.6595
## 137        12.1692       7.39996   -11.029157         40.7572
## 138         7.6245      13.23600   -21.449617         16.2153
## 139        10.8824       7.23178    -2.990023         29.9404
## 140         8.4727      11.98150   -25.387556          8.3163
## 142         8.9861      14.77008     6.868423         29.1844
## 144         8.8097      16.82108   -30.591567         35.4529
## 146         8.0109      15.08030    -1.056403         11.4148
## 147        11.2204      12.33663   -15.774389         40.4629
## 150        14.5804      16.56784    -0.269590         31.7726
## 151         8.3428       7.26048   -15.985527         29.5574
## 157         8.3840      11.96108     5.237386         10.8006
## 161         9.8318      11.21248   -19.264777         19.9972
## 163        10.9266      13.11354   -17.520810         21.2408
## 164         8.8519      11.48960    -6.754004         32.5082
## 166        15.1125      14.38085   -20.214168         11.7348
## 169         7.2124      13.13055    -6.412477         19.9792
## 175        15.6980      12.90325   -27.124475         22.9564
## 179        18.4914       8.98429   -19.329173         31.0862
## 180        16.0009      14.48126   -31.852569         30.5330
##     Class.attribute
## 1          Abnormal
## 3          Abnormal
## 4          Abnormal
## 14         Abnormal
## 15         Abnormal
## 17         Abnormal
## 22         Abnormal
## 23         Abnormal
## 25         Abnormal
## 26         Abnormal
## 29         Abnormal
## 30         Abnormal
## 35         Abnormal
## 39         Abnormal
## 44         Abnormal
## 46         Abnormal
## 47         Abnormal
## 51         Abnormal
## 52         Abnormal
## 53         Abnormal
## 61         Abnormal
## 62         Abnormal
## 64         Abnormal
## 65         Abnormal
## 66         Abnormal
## 67         Abnormal
## 72         Abnormal
## 73         Abnormal
## 74         Abnormal
## 75         Abnormal
## 76         Abnormal
## 77         Abnormal
## 81         Abnormal
## 82         Abnormal
## 83         Abnormal
## 84         Abnormal
## 85         Abnormal
## 90         Abnormal
## 91         Abnormal
## 92         Abnormal
## 96         Abnormal
## 99         Abnormal
## 101        Abnormal
## 105        Abnormal
## 106        Abnormal
## 108        Abnormal
## 112        Abnormal
## 115        Abnormal
## 118        Abnormal
## 119        Abnormal
## 122        Abnormal
## 123        Abnormal
## 127        Abnormal
## 129        Abnormal
## 133        Abnormal
## 134        Abnormal
## 136        Abnormal
## 137        Abnormal
## 138        Abnormal
## 139        Abnormal
## 140        Abnormal
## 142        Abnormal
## 144        Abnormal
## 146        Abnormal
## 147        Abnormal
## 150        Abnormal
## 151        Abnormal
## 157        Abnormal
## 161        Abnormal
## 163        Abnormal
## 164        Abnormal
## 166        Abnormal
## 169        Abnormal
## 175        Abnormal
## 179        Abnormal
## 180        Abnormal
##  [ reached 'max' / getOption("max.print") -- omitted 29 rows ]

Filtering data frames

##     Pelvic.incidence Pelvic.tilt Lumbar.lordosis.angle Sacral.slope
## 26          54.12492    26.65049              35.32975     27.47443
## 76          70.22145    39.82272              68.11840     30.39873
## 84          81.10410    24.79417              77.88702     56.30993
## 99          77.65512    22.43295              93.89278     55.22217
## 106         65.00796    27.60261              50.94752     37.40536
## 112         84.99896    29.61010              83.35219     55.38886
## 129         90.51396    28.27250              69.81394     62.24146
## 179         80.65432    26.34438              60.89812     54.30994
## 231         65.61180    23.13792              62.58218     42.47388
## 303         54.60032    21.48897              29.36022     33.11134
##     Pelvic.radius Degree.spondylolisthesis Pelvic.slope Direct.tilt
## 26       121.4470                 1.571205    0.9286879     14.6686
## 76       148.5256               145.378143    0.9466106     10.3840
## 84       151.8399                65.214616    0.9720056     10.5715
## 99       123.0557                61.211187    0.9249029     14.9502
## 106      116.5811                 7.015978    0.8673241     12.1292
## 112      126.9130                71.321175    0.9988267      7.0551
## 129      100.8922                58.823648    0.8814413     13.5739
## 179      120.1035                52.467552    0.9316036     20.0845
## 231      124.1280                -4.083298    0.9972475     30.0422
## 303      118.3433                -1.471067    0.9629075     30.8554
##     Thoracic.slope Cervical.tilt Sacrum.angle Scoliosis.slope
## 26         13.5700      16.12951   -17.630363         28.1902
## 76          9.5742      11.22353     4.641629          9.8472
## 84         11.2339      13.29506   -12.139219         11.8487
## 99         15.0493       7.57722     0.307904         33.7201
## 106        13.8536      11.95397   -20.735613          9.7675
## 112         9.0119       9.85541   -19.314135         43.0086
## 129        16.3289      16.52676   -10.917156         35.6543
## 179        18.4914       8.98429   -19.329173         31.0862
## 231        17.6222      13.39076   -16.369970         21.4495
## 303        11.4198      13.82322    -5.606449         18.5514
##     Class.attribute
## 26         Abnormal
## 76         Abnormal
## 84         Abnormal
## 99         Abnormal
## 106        Abnormal
## 112        Abnormal
## 129        Abnormal
## 179        Abnormal
## 231          Normal
## 303          Normal
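The filtering code itself is not shown on the slide. The sketch below uses a logical vector in the row position of the brackets; the two conditions and their thresholds are assumptions that happen to be consistent with the rows printed above, not the slide's confirmed code. The five rows are copied from the outputs in these slides.

```r
data_spine <- data.frame(
  Pelvic.tilt     = c(22.55259, 10.06099, 26.65049, 23.13792, 21.48897),
  Pelvic.slope    = c(0.7445035, 0.4151857, 0.9286879, 0.9972475, 0.9629075),
  Direct.tilt     = c(12.5661, 12.8874, 14.6686, 30.0422, 30.8554),
  Class.attribute = factor(c("Abnormal", "Abnormal", "Abnormal", "Normal", "Normal")),
  row.names       = c("1", "2", "26", "231", "303")
)

# One condition: keep rows where the test is TRUE, and all columns.
tilted <- data_spine[data_spine$Pelvic.tilt > 20, ]

# Two conditions joined with &.
both <- data_spine[data_spine$Pelvic.tilt > 20 & data_spine$Pelvic.slope > 0.85, ]

tilted
both
```

The comma before the closing bracket matters: the condition selects rows, and the empty column position means "all columns". Row names keep their original values, which is why the printed output skips numbers.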

Filtering data frames and selecting variables

##     Direct.tilt Class.attribute
## 26      14.6686        Abnormal
## 76      10.3840        Abnormal
## 84      10.5715        Abnormal
## 99      14.9502        Abnormal
## 106     12.1292        Abnormal
## 112      7.0551        Abnormal
## 129     13.5739        Abnormal
## 179     20.0845        Abnormal
## 231     30.0422          Normal
## 303     30.8554          Normal
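Filtering rows and selecting columns can be done in one bracket call, or with subset(). The condition below is an assumed one for illustration; the rows are copied from the outputs in these slides.

```r
data_spine <- data.frame(
  Pelvic.slope    = c(0.7445035, 0.4151857, 0.9286879, 0.9972475, 0.9629075),
  Direct.tilt     = c(12.5661, 12.8874, 14.6686, 30.0422, 30.8554),
  Class.attribute = factor(c("Abnormal", "Abnormal", "Abnormal", "Normal", "Normal")),
  row.names       = c("1", "2", "26", "231", "303")
)

# Rows go before the comma, columns after it.
keep <- data_spine$Pelvic.slope > 0.85
via_brackets <- data_spine[keep, c("Direct.tilt", "Class.attribute")]

# subset() expresses the same thing with bare column names.
via_subset <- subset(data_spine, Pelvic.slope > 0.85,
                     select = c(Direct.tilt, Class.attribute))

via_brackets
```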

Adding a variable

##   Pelvic.incidence Pelvic.tilt Lumbar.lordosis.angle Sacral.slope
## 1         63.02782    22.55259              39.60912     40.47523
## 2         39.05695    10.06099              25.01538     28.99596
## 3         68.83202    22.21848              50.09219     46.61354
## 4         69.29701    24.65288              44.31124     44.64413
##   Pelvic.radius Degree.spondylolisthesis Pelvic.slope Direct.tilt
## 1      98.67292                -0.254400    0.7445035     12.5661
## 2     114.40543                 4.564259    0.4151857     12.8874
## 3     105.98514                -3.530317    0.4748892     26.8343
## 4     101.86850                11.211523    0.3693453     23.5603
##   Thoracic.slope Cervical.tilt Sacrum.angle Scoliosis.slope
## 1        14.5386      15.30468    -28.65850         43.5123
## 2        17.5323      16.78486    -25.53061         16.1102
## 3        17.4861      16.65897    -29.03189         19.2221
## 4        12.7074      11.42447    -30.47025         18.8329
##   Class.attribute bad.angle
## 1        Abnormal        No
## 2        Abnormal       Yes
## 3        Abnormal        No
## 4        Abnormal        No
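A new variable is added by assigning to a column name that does not exist yet. The rule the slide used for bad.angle is not shown; the threshold below is hypothetical, chosen only because it reproduces the four values printed above.

```r
data_spine_small <- data.frame(
  Pelvic.tilt     = c(22.55259, 10.06099, 22.21848, 24.65288),
  Class.attribute = rep("Abnormal", 4)
)

# Hypothetical rule: flag rows whose pelvic tilt is below 20.
data_spine_small$bad.angle <- ifelse(data_spine_small$Pelvic.tilt < 20, "Yes", "No")
data_spine_small
```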

Removing a variable
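The slide's code for this step is not shown. Two common base R ways to drop a column are sketched here, on a small hand-built data frame (an assumption for self-containment):

```r
data_spine_small <- data.frame(
  Pelvic.tilt  = c(22.55259, 10.06099, 22.21848, 24.65288),
  Pelvic.slope = c(0.7445035, 0.4151857, 0.4748892, 0.3693453),
  bad.angle    = c("No", "Yes", "No", "No")
)

# Option 1: assign NULL to the column, which modifies the data frame in place.
data_spine_small$bad.angle <- NULL

# Option 2: keep every column except the named one.
# drop = FALSE keeps the result a data frame even when one column remains.
no_slope <- data_spine_small[, names(data_spine_small) != "Pelvic.slope", drop = FALSE]

names(data_spine_small)
names(no_slope)
```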

Creating derived variables

For deriving multiple variables into a data frame

##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

For deriving multiple variables into a data frame

## 'data.frame':    32 obs. of  13 variables:
##  $ mpg    : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
##  $ cyl    : num  6 6 4 6 8 6 8 4 4 6 ...
##  $ disp   : num  160 160 108 258 360 ...
##  $ hp     : num  110 110 93 110 175 105 245 62 95 123 ...
##  $ drat   : num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
##  $ wt     : num  2.62 2.88 2.32 3.21 3.44 ...
##  $ qsec   : num  16.5 17 18.6 19.4 17 ...
##  $ vs     : num  0 0 1 1 0 1 0 1 1 1 ...
##  $ am     : num  1 1 1 0 0 0 0 0 0 0 ...
##  $ gear   : num  4 4 4 3 3 3 3 4 4 4 ...
##  $ carb   : num  4 4 1 1 2 1 4 2 2 4 ...
##  $ kmpg   : num  33.6 33.6 36.5 34.2 29.9 ...
##  $ low.mpg: Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 2 1 1 1 ...
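One base R way to derive several variables at once is transform(); whether the slide used it is an assumption. The factor 1.6 matches the printed kmpg values (21 mpg becomes 33.6). The cutoff of 15 for low.mpg is hypothetical: the slide does not show it, and any cutoff between 14.3 and 18.1 would give the same first ten values.

```r
# mtcars ships with R in the datasets package.
mtcars2 <- transform(
  mtcars,
  kmpg    = mpg * 1.6,                             # miles to kilometres, roughly
  low.mpg = factor(ifelse(mpg < 15, "Yes", "No"))  # hypothetical cutoff
)

str(mtcars2)
```

Wrapping ifelse() in factor() keeps low.mpg a factor regardless of R version.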

Adding new data to a data frame

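You can concatenate two data frames row-wise using rbind as long as they have the same variable names; rbind matches data frame columns by name, so keeping the columns in the same order is good practice but not required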

##    Pelvic.incidence Pelvic.tilt Lumbar.lordosis.angle Sacral.slope
## 1          63.02782    22.55259              39.60912     40.47523
## 2          39.05695    10.06099              25.01538     28.99596
## 3          68.83202    22.21848              50.09219     46.61354
## 4          69.29701    24.65288              44.31124     44.64413
## 8          45.36675    10.75561              29.03835     34.61114
## 22         54.91944    21.06233              42.20000     33.85711
##    Pelvic.radius Degree.spondylolisthesis Pelvic.slope Direct.tilt
## 1       98.67292                -0.254400    0.7445035     12.5661
## 2      114.40543                 4.564259    0.4151857     12.8874
## 3      105.98514                -3.530317    0.4748892     26.8343
## 4      101.86850                11.211523    0.3693453     23.5603
## 8      117.27007               -10.675871    0.1319726     28.8165
## 22     125.21272                 2.432561    0.1752446     23.0791
##    Thoracic.slope Cervical.tilt Sacrum.angle Scoliosis.slope
## 1         14.5386      15.30468   -28.658501         43.5123
## 2         17.5323      16.78486   -25.530607         16.1102
## 3         17.4861      16.65897   -29.031888         19.2221
## 4         12.7074      11.42447   -30.470246         18.8329
## 8          7.7676       7.60961   -25.111459         26.3543
## 22        14.2195      14.14196     3.780394         24.9278
##    Class.attribute
## 1         Abnormal
## 2         Abnormal
## 3         Abnormal
## 4         Abnormal
## 8         Abnormal
## 22        Abnormal
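A small sketch of rbind() on data frames, using values from rows 1, 2, 8 and 22 above. The second data frame deliberately lists its columns in a different order to show that the data frame method of rbind() matches columns by name. The object names are assumptions.

```r
top <- data.frame(
  Pelvic.tilt  = c(22.55259, 10.06099),
  Pelvic.slope = c(0.7445035, 0.4151857)
)

# Same variable names, different column order.
more <- data.frame(
  Pelvic.slope = c(0.1319726, 0.1752446),
  Pelvic.tilt  = c(10.75561, 21.06233)
)

stacked <- rbind(top, more)
stacked
```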

Adding new data to a data frame

You can add the columns of a new data frame to an existing data frame using cbind as long as both have the same number of rows and the columns have no common names

##   Pelvic.slope Class.attribute Sex Race
## 1    0.7445035        Abnormal   M    W
## 2    0.4151857        Abnormal   F    B
## 3    0.4748892        Abnormal   M   As
## 4    0.3693453        Abnormal   M    B
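A sketch that reproduces the shape of the output above with cbind(). The object names measurements and demographics are assumptions; the values are those printed on the slide.

```r
measurements <- data.frame(
  Pelvic.slope    = c(0.7445035, 0.4151857, 0.4748892, 0.3693453),
  Class.attribute = rep("Abnormal", 4)
)

demographics <- data.frame(
  Sex  = c("M", "F", "M", "M"),
  Race = c("W", "B", "As", "B")
)

# Rows are paired by position, so both inputs must be in the same row order.
combined <- cbind(measurements, demographics)
combined
```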