---
title: finstr - Financial Statements in R
output:
  html_document:
    keep_md: yes
---
```{r, echo=FALSE, results='hide', message=FALSE }
library(dplyr)
library(tidyr)
library(finstr)
data(xbrl_data_aapl2013)
data(xbrl_data_aapl2014)
```
**Warning: the finstr package is in development.
Please use with caution.**

The purpose of the finstr package is to make financial statement
data available in a more structured form for further processing.
For now it offers:

1. A data structure for financial statements in a tidy and usable format
2. A function to merge two reporting periods into a single object
3. Some helper functions to explore and manipulate the data in the
structure

The long-term idea is to create an environment for reproducible financial
statement analysis. With existing packages like XBRL for XBRL parsing,
dplyr for data manipulation and knitr for reproducible research, this
shouldn't be a long journey.
## Get data
Use the XBRL package or the `xbrl_parse_min` function to parse XBRL files. For example:
```{r xbrl_parse_min, eval=FALSE, echo=TRUE}
library(finstr)
# parse XBRL (Apple 10-K report)
xbrl_url2014 <-
  "http://edgar.sec.gov/Archives/edgar/data/320193/000119312514383437/aapl-20140927.xml"
xbrl_url2013 <-
  "http://edgar.sec.gov/Archives/edgar/data/320193/000119312513416534/aapl-20130928.xml"
xbrl_data_aapl2014 <- xbrl_parse_min(xbrl_url2014)
xbrl_data_aapl2013 <- xbrl_parse_min(xbrl_url2013)
```
Use `xbrl_get_statements` to convert the XBRL data to a *statements* object.
```{r xbrl_get_statements}
st2013 <- xbrl_get_statements(xbrl_data_aapl2013)
st2014 <- xbrl_get_statements(xbrl_data_aapl2014)
st2014
```
A *statements* object is a list of
several *statement* objects (balance sheets, income statements and cash
flow statements). Each one is a data frame with elements as columns and periods
as rows.
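
As a rough picture of that layout, here is a minimal base R sketch. It does not use finstr: the values, the element names and the `toy_*` objects are all made up for illustration.

```r
# Toy illustration only: hypothetical values, not parsed from any XBRL file.
toy_balance_sheet <- data.frame(
  endDate = c("2013-09-28", "2014-09-27"),   # one row per period
  CashAndEquivalents = c(10, 12),            # one column per terminal element
  AccountsReceivable = c(5, 7),
  stringsAsFactors = FALSE
)

# A statements-like object is then just a named list of such data frames.
toy_statements <- list(ToyBalanceSheet = toy_balance_sheet)

names(toy_statements)                  # statement names
nrow(toy_statements$ToyBalanceSheet)   # number of reporting periods
```

A real statement object additionally carries the element hierarchy as an attribute, which this sketch leaves out.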
To get a single *statement*, use the *statements* object as a regular R list:
```{r statement}
balance_sheet2013 <- st2013$StatementOfFinancialPositionClassified
balance_sheet2014 <- st2014$StatementOfFinancialPositionClassified
balance_sheet2014
```
Only terminal (lowest-level) concepts and their values are kept in the statement
object's columns.
Information about the hierarchical definition of higher-order concepts is stored
as an attribute of the statement object.
To see the calculation hierarchy of the elements, use `get_relations`:
```{r relations}
get_relations(balance_sheet2014)
```
To list the fundamental elements that belong to a higher-order element, use
`get_elements`:
```{r elements}
get_elements(balance_sheet2014, parent_id = "LiabilitiesCurrent", as_data_frame = TRUE)
```
## Merge statements from different periods
Use the `merge` function to combine two statements into a single financial
statement object.
```{r merge_statement}
balance_sheet <- merge( balance_sheet2013, balance_sheet2014 )
```
The structure of the two balance sheets may differ because of a taxonomy change.
The `merge` function handles such structural changes by expanding the element
hierarchy to capture the elements and relations of both statements.
The values of any missing elements are set to 0.
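
The zero-filling idea can be sketched in base R on toy data. This is not how finstr implements `merge` (which also reconciles the element hierarchy). The helper `fill_missing` and all values are hypothetical.

```r
# Toy data: two periods whose element sets differ (hypothetical values).
bs_2013 <- data.frame(endDate = "2013-09-28", Cash = 10, Goodwill = 5,
                      stringsAsFactors = FALSE)
bs_2014 <- data.frame(endDate = "2014-09-27", Cash = 12, Receivables = 7,
                      stringsAsFactors = FALSE)

# Hypothetical helper, not part of finstr:
# add every absent column filled with 0 and put columns in a common order.
fill_missing <- function(df, cols) {
  for (col in setdiff(cols, names(df))) df[[col]] <- 0
  df[, cols, drop = FALSE]
}

all_cols <- union(names(bs_2013), names(bs_2014))
toy_merged <- rbind(fill_missing(bs_2013, all_cols),
                    fill_missing(bs_2014, all_cols))
toy_merged
```

In the result, `Receivables` is 0 for the 2013 row and `Goodwill` is 0 for the 2014 row, so both periods share one set of columns.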
To merge all statements of two *statements* objects, call `merge` on the *statements* objects themselves:
```{r merge_statements}
# merge all statements
st_all <- merge( st2013, st2014 )
# extract the merged balance sheet:
balance_sheet <- st_all$StatementOfFinancialPositionClassified
```
## Prepare data with higher order concepts
### Simple example
To get the higher-order values in the hierarchy, we have to sum the values of
the fundamental elements. The `expose` function does this for us:
```{r expose1}
library(dplyr)
balance_sheet %>%
  expose("Assets",
         "Liabilities",
         "CommitmentsAndContingencies",
         "StockholdersEquity")
```
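
To make the roll-up idea concrete, here is a small base R sketch on toy data. It is an illustration of summing terminal elements along a hierarchy, not finstr's implementation of `expose`. The hierarchy `toy_relations`, the helpers `leaves` and `roll_up`, and all values are hypothetical.

```r
# Toy statement with terminal elements only (hypothetical values).
toy_bs <- data.frame(
  endDate = c("2013-09-28", "2014-09-27"),
  Cash = c(10, 12),
  Receivables = c(5, 7),
  Goodwill = c(3, 3),
  stringsAsFactors = FALSE
)

# Hypothetical calculation hierarchy: parent -> direct children.
toy_relations <- list(
  Assets = c("AssetsCurrent", "Goodwill"),
  AssetsCurrent = c("Cash", "Receivables")
)

# Recursively collect the terminal elements below a concept.
leaves <- function(id, relations) {
  children <- relations[[id]]
  if (is.null(children)) return(id)   # no children: the concept is terminal
  unlist(lapply(children, leaves, relations = relations), use.names = FALSE)
}

# Value of a concept = row-wise sum of its terminal elements.
roll_up <- function(df, id, relations) {
  rowSums(df[, leaves(id, relations), drop = FALSE])
}

toy_assets <- roll_up(toy_bs, "Assets", toy_relations)
toy_assets_current <- roll_up(toy_bs, "AssetsCurrent", toy_relations)
```

Storing only terminal values and deriving totals on demand keeps the data free of redundant sums, which is why the hierarchy has to travel with the statement.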
We can also define new names for elements. Let's say we would like to see *contingencies*
and *equity* summed up in the liabilities element:
```{r expose2}
balance_sheet %>%
  expose("Assets",
         Liabilities = c("Liabilities",
                         "CommitmentsAndContingencies",
                         "StockholdersEquity"))
```
### Using other
The `other` function sums everything not yet covered inside a higher-order element.
To split assets into current and non-current, we can define non-current assets
as the other assets that remain after current assets have been "used":
```{r other1}
balance_sheet %>%
  expose("AssetsCurrent",
         NonCurrentAssets = other("Assets"),
         Liabilities = other())
```
Note that we used `other` without an element definition for the rest of the balance
sheet. In this case `other()` returns the sum of everything not already
used.
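
The "remainder" logic can be pictured as a set difference over terminal elements. The base R sketch below uses toy data and hypothetical element groupings. It only illustrates the idea and is not finstr's implementation of `other`.

```r
# Toy statement (hypothetical values).
toy_assets_df <- data.frame(
  Cash = c(10, 12),
  Receivables = c(5, 7),
  Goodwill = c(3, 3),
  Equipment = c(20, 25)
)

# Hypothetical groupings of terminal elements.
assets_all     <- c("Cash", "Receivables", "Goodwill", "Equipment")
assets_current <- c("Cash", "Receivables")

# "Other" assets = elements of Assets that have not been used yet.
assets_other <- setdiff(assets_all, assets_current)

toy_current     <- rowSums(toy_assets_df[, assets_current, drop = FALSE])
toy_non_current <- rowSums(toy_assets_df[, assets_other, drop = FALSE])

# The two parts add up to total assets.
toy_total <- rowSums(toy_assets_df[, assets_all, drop = FALSE])
all(toy_current + toy_non_current == toy_total)
```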
### Without
Sometimes we need to subtract one concept from another. For example:
```{r without1}
balance_sheet %>%
  expose(
    NonCurrentAssets = "Assets" %without% "AssetsCurrent",
    CurrentAssets = other("Assets")
  )
```
It is also possible to subtract several elements at once. For example:
```{r without2}
balance_sheet %>%
  expose(
    TangibleAssets =
      "Assets" %without% c("Goodwill", "IntangibleAssetsNetExcludingGoodwill"),
    IntangibleAssets = other("Assets")
  )
```
## Calculate new values and ratios
A statement object (in our case `balance_sheet`) is also a data frame,
with elements (or concepts) as columns and time periods as rows,
so it can be used like any other data frame.

Let's calculate the current ratio, which is defined by

$$ \text{Current Ratio} = \frac{\text{Current Assets}}{\text{Current Liabilities}} $$
```{r current_ratio}
library(dplyr)
balance_sheet %>%
  expose("AssetsCurrent", "LiabilitiesCurrent") %>%
  mutate(CurrentRatio = AssetsCurrent / LiabilitiesCurrent) %>%
  select(endDate, CurrentRatio)
```
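
As a worked check of the arithmetic, the same ratio can be computed in base R on a toy data frame. The values are hypothetical round numbers, not Apple's figures.

```r
# Toy exposed statement (hypothetical values).
toy_exposed <- data.frame(
  endDate = c("2013-09-28", "2014-09-27"),
  AssetsCurrent = c(150, 200),
  LiabilitiesCurrent = c(100, 160),
  stringsAsFactors = FALSE
)

# Current ratio = current assets / current liabilities, row by row.
toy_exposed$CurrentRatio <-
  toy_exposed$AssetsCurrent / toy_exposed$LiabilitiesCurrent

toy_exposed[, c("endDate", "CurrentRatio")]
```

Here the ratio is 150 / 100 = 1.5 for the first period and 200 / 160 = 1.25 for the second.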
If we need a period average, we can use the `lag` function.
For example, to calculate *DSO* (days sales outstanding) over longer periods,
the average of accounts receivable is compared to net sales.

We will use the formula for yearly statements:

$$ \text{DSO} = \frac{\text{Average Accounts Receivable}}{\text{Sales Revenue}} \times 365 $$

In this case we need to connect two types of statements: balance sheets and
income statements. With matching reporting periods, this can be accomplished
by joining the two data frames:
```{r DaysSalesOutstanding}
balance_sheet %>%
  inner_join(st_all$StatementOfIncome, by = "endDate") %>%
  mutate(
    AccountReceivableLast = lag(AccountsReceivableNetCurrent),
    AccountReceivableAvg = (AccountReceivableLast + AccountsReceivableNetCurrent) / 2,
    DaysSalesOutstanding = AccountReceivableAvg / SalesRevenueNet * 365
  ) %>%
  select(endDate, DaysSalesOutstanding) %>%
  na.omit()
```
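
To see why the first period drops out, here is the same calculation in base R on toy data. The values are hypothetical, and `lag1` is a hand-written stand-in for `dplyr::lag` with its default shift of one row.

```r
# Toy joined data: one row per year, ordered by endDate (hypothetical values).
toy_joined <- data.frame(
  endDate = c("2013-09-28", "2014-09-27"),
  AccountsReceivableNetCurrent = c(100, 140),
  SalesRevenueNet = c(1000, 1200),
  stringsAsFactors = FALSE
)

# Hypothetical helper: shift a vector down by one position, padding with NA.
lag1 <- function(x) c(NA, head(x, -1))

# Average of the previous and the current period's receivables.
ar <- toy_joined$AccountsReceivableNetCurrent
ar_avg <- (lag1(ar) + ar) / 2

toy_joined$DaysSalesOutstanding <- ar_avg / toy_joined$SalesRevenueNet * 365

# The first period has no predecessor, so its DSO is NA and is dropped.
toy_dso <- na.omit(toy_joined[, c("endDate", "DaysSalesOutstanding")])
toy_dso
```

For the second period the average receivable is (100 + 140) / 2 = 120, so DSO is 120 / 1200 × 365 = 36.5 days. Note that the lag only makes sense when rows are sorted by period.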