I am learning the basics of R with RStudio.
Load the test dataset
data <- read.csv("petdf_mantle_xenolith.csv")
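Since the CSV file itself is not included here, the following is only a minimal sketch of the loading step. It writes a small made-up table to a temporary file, reads it back with read.csv(), and checks the size and column types. The data frame toy, its values, and the name toy_data are invented for illustration and only loosely mirror the real dataset.

```r
# Toy stand-in for the real CSV (values are made up for illustration)
toy <- data.frame(
  SAMPLE_ID = c("S1", "S2", "S3"),
  ROCK.NAME = c("LHERZOLITE", "HARZBURGITE", "PYROXENITE"),
  TiO2      = c(0.38, NA, 0.08)
)

# Write to a temporary file, then load it the same way as the real dataset
tmp <- tempfile(fileext = ".csv")
write.csv(toy, tmp, row.names = FALSE)
toy_data <- read.csv(tmp)

dim(toy_data)    # number of rows and columns
names(toy_data)  # column names
str(toy_data)    # type of each column
```

Checking dim() and str() right after loading is a quick way to confirm that numeric columns such as TiO2 were read as numbers and that missing values arrived as NA.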
To see the first and last six rows, we use head() and tail().
head(data)
## SAMPLE_ID SAMPLE_NAME IGSN SAMPLE_TYPE LATITUDE LONGITUDE ELEVATION_MIN
## 1 11-01 11-01 Mineral 23.3994 58.1436 NA
## 2 11-10 11-10 Mineral 23.0494 58.6818 NA
## 3 11-33C 11-33C Mineral 23.8822 56.6274 NA
## 4 13-113 13-113 Mineral 25.4489 56.1223 NA
## 5 13-121 13-121 Mineral 25.4489 56.0957 NA
## 6 13-33 13-33 Whole Rock 23.6489 56.9362 NA
## ELEVATION_MAX TECTONIC_SETTING ROCK.NAME REFERENCE
## 1 NA OPHIOLITE LHERZOLITE PRIGENT, 2018[3698]
## 2 NA OPHIOLITE LHERZOLITE PRIGENT, 2018[3698]
## 3 NA OPHIOLITE LHERZOLITE PRIGENT, 2018[3698]
## 4 NA OPHIOLITE LHERZOLITE PRIGENT, 2018[3698]
## 5 NA OPHIOLITE LHERZOLITE PRIGENT, 2018[3698]
## 6 NA OPHIOLITE HARZBURGITE PRIGENT, 2018[3698]
......
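Both head() and tail() accept an n argument when six rows is not the right amount. The sketch below uses a made-up ten-row data frame (toy, with invented id and TiO2 columns) rather than the real dataset.

```r
# Made-up data frame with 10 rows (illustrative only)
toy <- data.frame(id = 1:10, TiO2 = (1:10) / 100)

# n controls how many rows are returned
first3 <- head(toy, n = 3)   # first three rows
last2  <- tail(toy, n = 2)   # last two rows

first3
last2
```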
tail(data)
## SAMPLE_ID SAMPLE_NAME IGSN SAMPLE_TYPE LATITUDE LONGITUDE
## 21074 ZHANCHI-NCC-FANG-FC-7 FC-7 Mineral 35 118.5
## 21075 ZHANCHI-NCC-FANG-FC-8 FC-8 Mineral 35 118.5
## 21076 ZHANCHI-NCC-FANG-FC8-1 FC8-1 Mineral 35 118.5
## 21077 ZHANCHI-NCC-FANG-FC8-3 FC8-3 Whole Rock 35 118.5
## 21078 ZHANCHI-NCC-FANG-XK4-4 XK4-4 Mineral 35 118.5
## 21079 ZHOUCHI-HAN-090DA11 90DA11 Mineral 38 113.0
## ELEVATION_MIN ELEVATION_MAX TECTONIC_SETTING ROCK.NAME
## 21074 NA NA INTRAPLATE_CRATON PYROXENITE
## 21075 NA NA INTRAPLATE_CRATON PYROXENITE
## 21076 NA NA INTRAPLATE_CRATON PYROXENITE
## 21077 NA NA INTRAPLATE_CRATON PYROXENITE
## 21078 NA NA INTRAPLATE_CRATON PYROXENITE
## 21079 NA NA INTRAPLATE_CRATON PYROXENITE
........
summary() gives us a statistical summary of all the columns of the data.
summary(data)
## SAMPLE_ID SAMPLE_NAME IGSN SAMPLE_TYPE
## Length:21079 Length:21079 Length:21079 Length:21079
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
##
## LATITUDE LONGITUDE ELEVATION_MIN ELEVATION_MAX
## Min. :-78.40 Min. :-176.500 Min. :-8782 Min. :-6483
## 1st Qu.: 22.00 1st Qu.: -7.954 1st Qu.:-4190 1st Qu.:-4100
## Median : 38.01 Median : 28.500 Median :-3540 Median :-3640
## Mean : 32.13 Mean : 25.234 Mean :-2898 Mean :-3195
## 3rd Qu.: 63.00 3rd Qu.: 100.000 3rd Qu.:-2382 3rd Qu.:-3073
## Max. : 86.76 Max. : 178.000 Max. : 3051 Max. : 4421
## NA's :18752 NA's :19331
......
Checking specific rows and columns
For example, to check the first row of the data:
data[1,]
## SAMPLE_ID SAMPLE_NAME IGSN SAMPLE_TYPE LATITUDE LONGITUDE ELEVATION_MIN
## 1 11-01 11-01 Mineral 23.3994 58.1436 NA
## ELEVATION_MAX TECTONIC_SETTING ROCK.NAME REFERENCE METHOD
## 1 NA OPHIOLITE LHERZOLITE PRIGENT, 2018[3698] EMP[82443]
## EXPEDITION.ID SiO2 TiO2 Al2O3 Cr2O3 Fe2O3 Fe2O3T FeO FeOT NiO MnO MgO
## 1 47.91 0.38 10.61 0.64 NA NA 3.2 NA 0.07 0.05 19.43
## CaO SrO Na2O K2O P2O5 BaO LOI H2O H2OM H2OP SO2 SO3 V2O3 V2O5 ZnO CoO
## 1 12.72 NA 1.83 0.01 NA NA NA NA NA NA NA NA 0.15 NA NA NA
## La2O3 Ce2O3 O Si Fe Mn Ni Co Cu Cd Zn As Ag S CaCO3 CuO FeCO3 Gd2O3 HfO2
.....
On the other hand, to check the first column of the data:
data[,1]
## [1] "11-01"
## [2] "11-10"
## [3] "11-33C"
## [4] "13-113"
## [5] "13-121"
## [6] "13-33"
## [7] "13-37A"
## [8] "13-37B"
## [9] "13-38"
## [10] "13-39"
## [11] "13-41"
## [12] "13-77"
## [13] "13-80"
## [14] "13-81"
## [15] "13-82"
.....
## [21078] "ZHANCHI-NCC-FANG-XK4-4"
## [21079] "ZHOUCHI-HAN-090DA11"
Selecting by the column name (TiO2)
data[,'TiO2']
## [1] 0.380000000 0.040000000 0.080000000 0.070000000 0.030000000
## [6] NA NA 0.020000000 0.020000000 0.050000000
## [11] NA 0.010000000 NA 0.020000000 0.210000000
## [16] 0.050000000 ....
An alternative way to select the TiO2 column
data$TiO2
## [1] 0.380000000 0.040000000 0.080000000 0.070000000 0.030000000
## [6] NA NA 0.020000000 0.020000000 0.050000000
## [11] NA 0.010000000 NA 0.020000000 0.210000000
## [16] 0.050000000 ...
Filtering the data
The subset() function lets us grab a subset of rows from the data. For example, I choose the rows with TiO2 higher than 50.
subset(data,subset=TiO2>50)
## SAMPLE_ID SAMPLE_NAME IGSN SAMPLE_TYPE
## 375 APPSAF-RIET-CMA10 CMA10 Whole Rock
## 391 APPSAF-RIET-CMA9 CMA9 Whole Rock
## 421 APPSAF-RIET-RTFN31.1 RTFN31.1 Mineral
## 424 APPSAF-RIET-RTFN43.1 RTFN43.1 Mineral
## 513 AULBCAN-BHT-K14-3A K14-3A Mineral
## 662 BEARUSS-KAND-SEC10XEN1 SEC10XEN1 Mineral
## 680 BEARUSS-KAND-SECTION5 SECTION5 Mineral
## 1081 CC-ME16 CC-ME16 Mineral
....
## SiO2 TiO2 Al2O3 Cr2O3 Fe2O3 Fe2O3T FeO FeOT NiO MnO MgO CaO
## 375 0.01 54.22 0.46 0.500 5.050 NA 26.22 NA NA 0.390 12.45 0.04
## 391 0.01 55.81 0.35 0.570 3.650 NA 28.67 NA NA 0.400 13.70 0.02
## 421 0.01 94.83 0.08 0.450 NA NA 3.02 NA NA 0.010 0.64 0.05
## 424 NA 55.54 0.28 0.330 3.830 NA 27.68 NA NA 0.350 12.30 0.03
## 513 41.83 53.77 ....
Ordering the data
The order() function returns the row positions that sort a column in increasing order. For example, I order the data by TiO2 content.
order.ti <- order(data[,'TiO2'])
head(data[order.ti,])
## SAMPLE_ID SAMPLE_NAME IGSN SAMPLE_TYPE LATITUDE LONGITUDE
## 128 AII0020-SE09 SE9 Mineral 0.930 -29.370
## 137 AII0032-3-008-016 AII32-8-16 Mineral 43.210 -28.933
## 155 AII0093-5-009-HD3 AII 93-5-9HD3 Mineral -26.470 67.450
## 176 AII0107-6-035-004 AII107:35-4 Mineral -54.723 0.803
## 183 AII0107-6-040-006 AII107:40-6 Mineral -54.422 1.528
## 239 ANS0006-044-006 S6-44-6 Mineral 8.130 -40.578
## ELEVATION_MIN ELEVATION_MAX TECTONIC_SETTING ROCK.NAME
## 128 -1463 -2304 FRACTURE_ZONE PERIDOTITE
## 137 -2250 -2532 FRACTURE_ZONE PERIDOTITE
## 155 NA NA SPREADING_CENTER LHERZOLITE
## 176 -584 NA FRACTURE_ZONE PERIDOTITE
## 183 -2724 -3240 FRACTURE_ZONE PERIDOTITE
## 239 -3750 -3780 FRACTURE_ZONE,FRACTURE_ZONE PERIDOTITE
## REFERENCE METHOD EXPEDITION.ID SiO2 TiO2 Al2O3 Cr2O3 Fe2O3
## 128 RODEN, 1984[920] EMP[36614] AII0020 38.71 0 0.00 11.70 NA
## 137 SHIBATA, 1986[242] EMP[34444] AII0032-3 46.66 0 1.14 0.75 5.51
## 155 DICK, 1984[687] EMP[30634] AII0093-5 NA 0 51.70 14.90 4.03
## 176 DICK, 1989[281] EMP[42310] AII0107-6 55.55 0 3.18 0.64 4.22
## 183 DICK, 1989[281] EMP[36432] AII0107-6 51.52 0 4.70 29.09 3.34
## 239 BONATTI, 1992[290] EMP[75600] ANS0006 53.64 0 5.14 1.11 2.52
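The filtering and ordering steps above can be tried on a small data frame where the result is easy to check by eye. This is only a sketch: toy and its values are made up, and the names high_ti, asc and desc are illustrative. It also shows how missing TiO2 values behave, which matters here because many rows in the dataset have NA.

```r
# Made-up values for illustration; not taken from the dataset
toy <- data.frame(
  SAMPLE_ID = c("S1", "S2", "S3", "S4"),
  TiO2      = c(0.38, NA, 54.22, 0.04)
)

# subset() keeps rows where the condition is TRUE;
# rows where the condition is NA are dropped
high_ti <- subset(toy, subset = TiO2 > 50)

# order() returns row positions, smallest value first;
# NA values are placed last by default
asc <- toy[order(toy$TiO2), ]

# decreasing = TRUE puts the largest TiO2 value first (NA still last)
desc <- toy[order(toy$TiO2, decreasing = TRUE), ]

high_ti
asc
desc
```

Sorting in decreasing order and then calling head() is a convenient way to look at the highest values instead of the lowest ones.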
Basics
Addition
1+2
3
Subtraction
5-2
3
Exponents
5^2
25
Division
4/2
2
Creating 2 vectors a and b, where a is (1,2,3) and b is (4,5,6)
a <- c(1:3)
a
[1] 1 2 3
b <- c(4:6)
b
[1] 4 5 6
Creating a 2 by 3 matrix from the vectors using the rbind() function
rbind(a,b)
  [,1] [,2] [,3]
a    1    2    3
b    4    5    6
Creating a 3 by 3 matrix consisting of the numbers 1-9
m1 <- matrix(1:9, byrow = FALSE, nrow = 3)
m1
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
Confirming that m1 is a matrix.
is.matrix(m1)
[1] TRUE
Creating a 5 by 5 matrix consisting of the numbers 1-25
m2 <- matrix(1:25, byrow = TRUE, nrow = 5)
m2
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    2    3    4    5
[2,]    6    7    8    9   10
[3,]   11   12   13   14   15
[4,]   16   17   18   19   20
[5,]   21   22   23   24   25
Sum of the matrix
sum(m2)
[1] 325
Sub-section of this matrix. Example: rows 4 and 5, columns 4 and 5
m2[4:5,4:5]
     [,1] [,2]
[1,]   19   20
[2,]   24   25
Creating a 4 by 5 matrix consisting of random numbers (minimum 1 and maximum 2)
m3 <- matrix(runif(20, min = 1, max = 2), byrow = TRUE, nrow = 4)
m3
         [,1]     [,2]     [,3]     [,4]     [,5]
[1,] 1.498197 1.749353 1.598072 1.529297 1.384383
[2,] 1.172590 1.245288 1.234412 1.402424 1.422403
[3,] 1.817597 1.873271 1.135152 1.247186 1.634686
[4,] 1.684299 1.659339 1.124365 1.202919 1.840211
Help with R
We can see the documentation/explanation using help()
help(vector)
Vectors
Description
vector produces a vector of the given length and mode.
as.vector, a generic, attempts to coerce its argument into a vector of mode mode (the default is to coerce to whichever vector mode is most convenient): if the result is atomic all attributes are removed.
is.vector returns TRUE if x is a vector of the specified mode having no attributes other than names. It returns FALSE otherwise.
Usage
vector(mode = "logical", length = 0)
as.vector(x, mode = "any")
is.vector(x, mode = "any")
Arguments
mode | character string naming an atomic mode or |
length | a non-negative integer specifying the desired length. For a long vector, i.e., |
x | an R object. |
Details
The atomic modes are "logical", "integer", "numeric" (synonym "double"), "complex", "character" and "raw".
If mode = "any", is.vector may return TRUE for the atomic modes, list and expression. For any mode, it will return FALSE if x has any attributes except names. (This is incompatible with S.) On the other hand, as.vector removes all attributes including names for results of atomic mode (but not those of mode "list" nor "expression").
Note that factors are not vectors; is.vector returns FALSE and as.vector converts a factor to a character vector for mode = "any".
Value
For vector, a vector of the given length and mode. Logical vector elements are initialized to FALSE, numeric vector elements to 0, character vector elements to "", raw vector elements to nul bytes and list/expression elements to NULL.
For as.vector, a vector (atomic or of type list or expression). All attributes are removed from the result if it is of an atomic mode, but not in general for a list result. The default method handles 24 input types and 12 values of type: the details of most coercions are undocumented and subject to change.
For is.vector, TRUE or FALSE. is.vector(x, mode = "numeric") can be true for vectors of types "integer" or "double" whereas is.vector(x, mode = "double") can only be true for those of type "double".
Methods for as.vector()
Writers of methods for as.vector need to take care to follow the conventions of the default method. In particular
Argument mode can be "any", any of the atomic modes, "list", "expression", "symbol", "pairlist" or one of the aliases "double" and "name".
The return value should be of the appropriate mode. For mode = "any" this means an atomic vector or list.
Attributes should be treated appropriately: in particular when the result is an atomic vector there should be no attributes, not even names.
is.vector(as.vector(x, m), m) should be true for any mode m, including the default "any".
Note
as.vector and is.vector are quite distinct from the meaning of the formal class "vector" in the methods package, and hence as(x, "vector") and is(x, "vector").
Note that as.vector(x) is not necessarily a null operation if is.vector(x) is true: any names will be removed from an atomic vector.
Non-vector modes "symbol" (synonym "name") and "pairlist" are accepted but have long been undocumented: they are used to implement as.name and as.pairlist, and those functions should preferably be used directly. None of the description here applies to those modes: see the help for the preferred forms.
References
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
See Also
c, is.numeric, is.list, etc.
Examples
df <- data.frame(x = 1:3, y = 5:7)
## Error:
try(as.vector(data.frame(x = 1:3, y = 5:7), mode = "numeric"))

x <- c(a = 1, b = 2)
is.vector(x)
as.vector(x)
all.equal(x, as.vector(x)) ## FALSE

###-- All the following are TRUE:
is.list(df)
! is.vector(df)
! is.vector(df, mode = "list")

is.vector(list(), mode = "list")
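A few of the points in the help text are easier to see by running them. The short sketch below is my own illustration of the behaviour described in the documentation (names, factors and attributes); the object names x, f, m and z are arbitrary.

```r
# A named numeric vector: names are the one attribute is.vector() allows
x <- c(a = 1, b = 2)
is.vector(x)           # TRUE
names(as.vector(x))    # NULL, because as.vector() drops the names

# A factor is not a vector; as.vector() converts it to character
f <- factor(c("low", "high"))
is.vector(f)           # FALSE
as.vector(f)           # "low" "high"

# A matrix has a dim attribute, so is.vector() is FALSE
m <- matrix(1:4, nrow = 2)
is.vector(m)           # FALSE

# vector() creates a vector of a given mode and length, filled with zeros
z <- vector("numeric", 3)
z                      # 0 0 0
```

The same help page can also be opened with the shorthand ?vector.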