====== R ======
{{tag>data_analysis R 통계}}
==== Source Code/ Tip ====
[[https://statkclee.github.io/r-gapminder-kr/02-project-intro/index.html|Project management with RStudio]]
* Network problems when installing packages [[http://rfriend.tistory.com/177|http://rfriend.tistory.com/177]]
Tools > Global Options... > Packages > __Use secure download method for HTTP__ (uncheck the box)
Source: [[http://rfriend.tistory.com/177|http://rfriend.tistory.com/177]] [R, Python 분석과 프로그래밍 (by R Friend)]
* Problems with Korean text (Hangul) [[http://r-bong.blogspot.com/2016/03/rstudio_26.html|http://r-bong.blogspot.com/2016/03/rstudio_26.html]]
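* **NA cannot be compared**: a comparison such as A > 0 or A != "abc" yields NA for NA elements, so those elements are dropped by subset() and dplyr::filter()
* **Watch for warnings when using replace()**: if there are NAs, the original vector and the replacement vector end up with different lengths
* When a file is read as character, blank cells are read as the empty string "" rather than NA: test them with == "" instead of is.na()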
* Evaluating a string (a variable name, an expression, etc.) as code:
* [[trivia:tips:npp_r_syntax_highlight]]
eval( parse( text=" " ) )
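A minimal sketch of the eval(parse(text = ...)) idiom above: the string is parsed into an expression and then evaluated in the current environment. The variable names (sales_2016, var_name) are hypothetical. When only a variable lookup or assignment is needed, base R's get() and assign() do the job without parsing code.

```r
# Hypothetical variables used only for illustration
sales_2016 <- c(10, 20, 30)
var_name <- "sales_2016"

# Build code as a string, parse it, then evaluate it
total <- eval(parse(text = paste0("sum(", var_name, ")")))

# When only a variable lookup is needed, get() avoids parsing code
same_total <- sum(get(var_name))

# assign() creates a variable whose name is held in a string
assign(paste0(var_name, "_mean"), mean(get(var_name)))
```

After running this, sales_2016_mean exists as an ordinary variable.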
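* Replacing values (lookup with match(): oldvals holds the old values, newvals the replacements in the same order)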
법인리스트$사업명re<-newvals[match(법인리스트$사업명,oldvals)]
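The match() lookup above can be sketched on a small vector. The lookup tables and the input x are hypothetical; values without a mapping come back as NA, so the last step shows one possible way to keep the original value in that case.

```r
# Hypothetical lookup table: old labels and their replacements
oldvals <- c("A", "B", "C")
newvals <- c("alpha", "beta", "gamma")

x <- c("B", "A", "C", "Z", NA)

# match() returns the position of each x in oldvals (NA when not found)
recoded <- newvals[match(x, oldvals)]

# Keep the original value where no mapping exists
recoded_keep <- ifelse(is.na(recoded), x, recoded)
```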
* trim function (strips leading and trailing whitespace)
trim <- function(x) gsub("^\\s+|\\s+$", "", x)
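A quick check of the trim helper, as a sketch with hypothetical input strings. Base R (3.2.0 and later) also provides trimws(), which gives the same result on these inputs.

```r
# Regex-based trim: "\\s" in an R string is the regex class \s (whitespace)
trim <- function(x) gsub("^\\s+|\\s+$", "", x)

# Hypothetical inputs with spaces, a tab and a newline around the text
x <- c("  abc ", "\tdef\n", "g h")
trimmed <- trim(x)

# Built-in equivalent for leading and trailing whitespace
trimmed_base <- trimws(x)
```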
* Error handling
err_msg <- tryCatch({
# Contents: the code that may fail
}, error = function(e) {
# LOOP_IDX is assumed to be the index of the enclosing loop
msg <- paste0(LOOP_IDX, " ERROR :", conditionMessage(e))
cat(msg, "\n")
return(msg)
}) # end of exception handling
print(err_msg)
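The pattern above is typically wrapped in a loop so that one failing item does not stop the rest. A minimal sketch, assuming a hypothetical list of inputs in which the second element fails:

```r
# Hypothetical inputs: the second one triggers an error
inputs <- list("1", "x", "3")

results <- vector("list", length(inputs))
for (LOOP_IDX in seq_along(inputs)) {
  results[[LOOP_IDX]] <- tryCatch({
    value <- suppressWarnings(as.numeric(inputs[[LOOP_IDX]]))
    if (is.na(value)) stop("not a number: ", inputs[[LOOP_IDX]])
    value * 2
  }, error = function(e) {
    # Return the message instead of stopping the loop
    paste0(LOOP_IDX, " ERROR :", conditionMessage(e))
  })
}
```

Each element of results then holds either the computed value or the error message for that iteration.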
* Getting the current path (directory of the document open in RStudio)
paths<-paste0(enc2native( dirname(rstudioapi::getActiveDocumentContext()$path)) , "/teleseller/")
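The NA notes earlier in this list can be illustrated with a short sketch. All data here is hypothetical, and the which()-based replace() is only one possible way to keep the index and the replacement values the same length.

```r
A <- c(3, NA, -1, 5)

# A comparison with NA yields NA, not TRUE or FALSE
cmp <- A > 0

# subset() silently drops the NA row; add is.na() to keep it
df <- data.frame(A = A)
kept <- subset(df, A > 0)
kept_with_na <- subset(df, A > 0 | is.na(A))

# which() drops NA positions, so index and values have the same length
idx <- which(A > 0)
A_fixed <- replace(A, idx, A[idx] * 10)

# Blank cells read as character are "" rather than NA
s <- c("abc", "", NA)
blank <- !is.na(s) & s == ""
```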
==== Library ====
* [[http://www.kdnuggets.com/2015/06/top-20-r-packages.html|TOP 20 R packages by popularity]]
* [[https://awesome-r.com|Awesome R]] : A curated list of awesome R packages and tools
* **Morpheme dictionary** (170221): the National Information Society Agency (NIA) announced on the 21st that it has developed and released a morpheme dictionary, a core component of Korean text analysis [[http://m.news.naver.com/read.nhn?mode=LSD&mid=sec&sid1=105&oid=092&aid=0002111971|External Link]]
* **세종기업데이터**: this code was written to collect the financial data of the companies published on 세종기업데이터. [[https://github.com/mrchypark/sejongFinData|External Link]]
* Drawing maps with ggmap [[http://www.bloter.net/archives/243114|External Link]]
* Analysis of actual housing transaction prices [[https://github.com/keepcosmos/budongsan|External Link]]
* [[https://cran.r-project.org/web/packages/editData/README.html|editData : RStudio Addin for edit data.frame ]]
=== data.frame ===
* Reading data quickly with fread (data.table package), every column as character
fread("C:/Users/heeseoklee.HEESEOKLEE-/Desktop/자료들2017년/04_05 기업DB/KED/LGU.txt", sep="|", header=FALSE, colClasses=list(character=1:54))
=== RODBC ===
* Accessing Excel, Access, etc. [[https://www.r-bloggers.com/getting-access-data-into-r/|https://www.r-bloggers.com/getting-access-data-into-r/]]
install.packages("RODBC")
library(RODBC)
#con<-odbcConnect("DB")
con <- odbcConnectAccess2007("c:/ .mdb",pw=" ")
table_name <- sqlTables(con, tableType = "TABLE")$TABLE_NAME
TBL <- sqlFetch(con, table_name[1])
str(TBL)
head(TBL)
qry <- "SELECT * FROM XXX"
class <- sqlQuery(con, qry)
str(class)
odbcCloseAll()
=== reshape2 ===
[[http://seananderson.ca/2013/10/19/reshape/|http://seananderson.ca/2013/10/19/reshape/]]
* melt : wide → long
melt( data, id.vars=c("KEYVALUE1","KEYVALUE2") )
Every column except the key columns is turned into rows
* dcast : long → wide
dcast( data, KEYVALUE1 + KEYVALUE2 ~ LONG_COLUMN )
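A round trip through melt and dcast, as a sketch that assumes the reshape2 package is installed. The table and its column names (sales, cost) are hypothetical; variable and value are the default column names that melt produces.

```r
library(reshape2)

# Hypothetical wide table: two key columns and two measure columns
wide <- data.frame(
  KEYVALUE1 = c("a", "b"),
  KEYVALUE2 = c(1, 2),
  sales = c(100, 200),
  cost = c(60, 150)
)

# wide -> long: every non-id column becomes (variable, value) rows
long <- melt(wide, id.vars = c("KEYVALUE1", "KEYVALUE2"))

# long -> wide: one column per level of `variable`
wide_again <- dcast(long, KEYVALUE1 + KEYVALUE2 ~ variable, value.var = "value")
```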
=== igraph ===
* Rotate (network) graph [[https://stackoverflow.com/questions/18440292/a-way-to-rotate-plot-in-r|https://stackoverflow.com/questions/18440292/a-way-to-rotate-plot-in-r]]
* get coordinates by layout function
* using [[https://en.wikipedia.org/wiki/Rotation_matrix|rotation matrix]]
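The two steps above can be sketched in base R: a layout is just an n x 2 matrix of coordinates, so rotating the plot amounts to multiplying it by a rotation matrix. The helper name rotate_layout and the sample coordinates are hypothetical; with igraph the coordinates would come from a layout function such as igraph::layout_with_fr(g).

```r
# Rotate an n x 2 matrix of layout coordinates counter-clockwise by `degrees`
rotate_layout <- function(coords, degrees) {
  theta <- degrees * pi / 180
  rot <- matrix(c(cos(theta), -sin(theta),
                  sin(theta),  cos(theta)), nrow = 2, byrow = TRUE)
  # Each row is a point (x, y); multiplying by t(rot) rotates every row
  coords %*% t(rot)
}

# Hypothetical coordinates standing in for a layout function's output
coords <- matrix(c(1, 0,
                   0, 1), ncol = 2, byrow = TRUE)
rotated <- rotate_layout(coords, 90)
```

The rotated matrix can then be passed back to the plot, for example plot(g, layout = rotated).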
==== tidyverse ====
* [[http://ggplot2.tidyverse.org/|ggplot2]], for data visualisation.
* [[http://dplyr.tidyverse.org/|dplyr]], for data manipulation.
* [[http://tidyr.tidyverse.org/|tidyr]], for data tidying.
* [[http://readr.tidyverse.org/|readr]], for data import.
* [[http://purrr.tidyverse.org/|purrr]], for functional programming.
* [[http://tibble.tidyverse.org/|tibble]], for tibbles, a modern re-imagining of data frames.
* [[https://github.com/tidyverse/stringr|stringr]], for strings.
* [[https://github.com/hadley/forcats|forcats]], for factors.
* [[https://mrchypark.github.io/post/rtips-tbl-자료형에서-소수점을-출력해보자/|Printing decimal places in tbl objects]]
=== tidyverse style guide ===
[[http://style.tidyverse.org/|http://style.tidyverse.org/]]
=== Dplyr ===
* Row-wise sum/average across several columns
KED_not_PF %>%
rowwise() %>%
mutate(평균매출액=mean(c(as.double(매출액3),as.double(매출액2),as.double(매출액1)), na.rm=TRUE)) %>%
ungroup()
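A self-contained version of the row-wise mean, as a sketch that assumes dplyr 1.0 or later. The table and its column names (sales1 to sales3, avg_sales) are hypothetical stand-ins for the columns above. The second pipeline is a vectorised alternative using rowMeans(), which avoids rowwise() altogether.

```r
library(dplyr)

# Hypothetical data: three yearly sales columns stored as character
KED <- tibble(
  sales1 = c("100", "50"),
  sales2 = c("200", NA),
  sales3 = c("300", "70")
)

# Row-by-row mean, ignoring missing values
by_row <- KED %>%
  rowwise() %>%
  mutate(avg_sales = mean(as.double(c(sales3, sales2, sales1)), na.rm = TRUE)) %>%
  ungroup()

# Vectorised alternative: convert the columns once, then use rowMeans()
vectorised <- KED %>%
  mutate(avg_sales = rowMeans(across(starts_with("sales"), as.double), na.rm = TRUE))
```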
* mutate: referring to columns through a character vector of column names (i is the loop index)
DATA %>% mutate(!!paste0("주요품목",1:10)[i] := gsub("응용SW_","",!!rlang::sym(paste0("주요품목",1:10)[i])))
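The same idea in a runnable form, as a sketch that assumes dplyr 1.0 or later. The data and column names (item1, item2) are hypothetical. The loop uses !!col := on the left-hand side and the .data pronoun on the right; across() with all_of() is an alternative that handles all the columns in one call.

```r
library(dplyr)

# Hypothetical data with two numbered item columns
DATA <- tibble(
  item1 = c("SW_office", "SW_game"),
  item2 = c("HW_server", "SW_db")
)
cols <- paste0("item", 1:2)

# One column at a time: the column name is held in the string `col`
DATA_loop <- DATA
for (col in cols) {
  DATA_loop <- DATA_loop %>%
    mutate(!!col := gsub("SW_", "", .data[[col]]))
}

# All columns at once
DATA_across <- DATA %>%
  mutate(across(all_of(cols), ~ gsub("SW_", "", .x)))
```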
=== ggplot2 ===
==== Web Apps ====
[[:software_development:html_css_javascript|HTML CSS Javascript]]
* [[https://www.r-bloggers.com/deploying-desktop-apps-with-r/|Deploying Desktop Apps with R]] : using Shiny \\ R portable + Google Chrome portable + shell launch script
=== htmlWidgets ===
[[https://www.htmlwidgets.org/showcase_leaflet.html|https://www.htmlwidgets.org/showcase_leaflet.html]]
== visNetwork ==
[[https://datastorm-open.github.io/visNetwork/|https://datastorm-open.github.io/visNetwork/]]
[[https://cran.r-project.org/web/packages/visNetwork/vignettes/Introduction-to-visNetwork.html|https://cran.r-project.org/web/packages/visNetwork/vignettes/Introduction-to-visNetwork.html]]
* Speed problems: use the options below, trading good looks for speed
visIgraphLayout(layout = "layout.kamada.kawai")
visPhysics(stabilization = FALSE)
visEdges(smooth = FALSE)
* visEvents
== DT ==
[[https://rstudio.github.io/DT/|https://rstudio.github.io/DT/]]
* Formatting numbers with scales::comma() turns them into character, so sorting no longer works properly: \\ use datatable(…) %>% formatCurrency() or similar instead
* callback
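The sorting note above can be reproduced in base R: once numbers are formatted as text, they sort alphabetically. The data frame is hypothetical, and format(big.mark = ",") stands in for scales::comma() here. The DT call is guarded because it assumes the DT package is installed; it keeps the column numeric and adds the separators only when the table is rendered.

```r
# Hypothetical amounts
df <- data.frame(amount = c(900, 1000, 25000))

# Formatting first turns numbers into text, and text sorts alphabetically
as_text <- format(df$amount, big.mark = ",", trim = TRUE)
sorted_text <- sort(as_text)   # "1,000" "25,000" "900"

# Keep the column numeric and let DT add the separators when rendering
if (requireNamespace("DT", quietly = TRUE)) {
  widget <- DT::formatCurrency(
    DT::datatable(df),
    columns = "amount", currency = "", digits = 0
  )
}
```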
=== Interactive visNetwork plot with DT not using Shiny or Crosstalk ===
=== plotly ===
[[https://plotly-book.cpsievert.me/|https://plotly-book.cpsievert.me/]]
* Network graph : ggplot2 → plotly
=== crosstalk ===
[[https://rstudio.github.io/crosstalk/|https://rstudio.github.io/crosstalk/]]
Linking views without Shiny: DT, Plotly, …
=== RMarkdown ===
== flexdashboard ==
[[https://rmarkdown.rstudio.com/flexdashboard/using.html|https://rmarkdown.rstudio.com/flexdashboard/using.html]]
~~DISCUSSION~~