====== R ======

{{tag>data_analysis R 통계}}

==== Source Code/ Tip ====

[[https://statkclee.github.io/r-gapminder-kr/02-project-intro/index.html|Project management with RStudio (RStudio와 함께하는 프로젝트 관리)]]

  * Network problems when installing packages: in RStudio, open Tools > Global Options... > Packages and uncheck "Use secure download method for HTTP". Source: [[http://rfriend.tistory.com/177|R, Python 분석과 프로그래밍 (by R Friend)]]
  * Korean (Hangul) text problems: [[http://r-bong.blogspot.com/2016/03/rstudio_26.html|http://r-bong.blogspot.com/2016/03/rstudio_26.html]]
  * **NA cannot be compared**: a comparison such as ''%%A > 0%%'' or ''%%A != "abc"%%'' evaluates to NA for NA elements, so those rows drop out of the result when filtering (for example with ''subset()'' or ''dplyr::filter()'').
  * **Watch for warnings when using replace()**: when NAs are present, the original vector and the replacement vector end up with different lengths.
  * When a file is read as character, blank fields are read as the empty string ''%%""%%'' rather than NA, so test them with ''%%== ""%%'' instead of ''is.na()''.
  * Running the contents of a variable (a string) as code:
    * [[trivia:tips:npp_r_syntax_highlight]]
<code r>
eval(parse(text = " "))
</code>
  * Recoding values (look up each old value and take the matching new value):
<code r>
법인리스트$사업명re <- newvals[match(법인리스트$사업명, oldvals)]
</code>
  * trim function (strips leading and trailing whitespace):
<code r>
trim <- function(x) gsub("^\\s+|\\s+$", "", x)
</code>
  * Error handling
<code r>
err_msg <- tryCatch({
  # Contents
}, error = function(e) {
  # build the message once so it can be both printed and returned
  err_msg <- paste0(LOOP_IDX, " ERROR :", conditionMessage(e))
  cat(LOOP_IDX, " ERROR :", conditionMessage(e))
  return(err_msg)
})  # end of error handling
print(err_msg)
</code>
  * Getting the directory of the current script (in RStudio):
<code r>
paths <- paste0(enc2native(dirname(rstudioapi::getActiveDocumentContext()$path)), "/teleseller/")
</code>

==== Library ====

  * [[http://www.kdnuggets.com/2015/06/top-20-r-packages.html|TOP 20 R packages by popularity]]
  * [[https://awesome-r.com|Awesome R]] : A curated list of awesome R packages and tools
  * **Morpheme dictionary** (170221): the National Information Society Agency (NIA, 한국정보화진흥원) announced on the 21st that it had developed and released a morpheme dictionary, a core component of Korean text analysis. [[http://m.news.naver.com/read.nhn?mode=LSD&mid=sec&sid1=105&oid=092&aid=0002111971|External Link]]
  * **세종기업데이터** (Sejong corporate data): this code was written to collect the financial data of companies published on 세종기업데이터.
[[https://github.com/mrchypark/sejongFinData|External Link]]
  * Drawing maps with ggmap [[http://www.bloter.net/archives/243114|External Link]]
  * Analysis of actual housing transaction prices [[https://github.com/keepcosmos/budongsan|External Link]]
  * [[https://cran.r-project.org/web/packages/editData/README.html|editData : RStudio Addin for edit data.frame]]

=== data.frame ===

  * Reading data quickly with ''data.table::fread()'':
<code r>
library(data.table)

# read a pipe-delimited file with no header, forcing all 54 columns to character
fread("C:/Users/heeseoklee.HEESEOKLEE-/Desktop/자료들2017년/04_05 기업DB/KED/LGU.txt",
      sep = "|", header = FALSE, colClasses = list(character = 1:54))
</code>

=== RODBC ===

  * Accessing Excel, Access, and similar data sources: [[https://www.r-bloggers.com/getting-access-data-into-r/|https://www.r-bloggers.com/getting-access-data-into-r/]]
<code r>
install.packages("RODBC")
library(RODBC)

# con <- odbcConnect("DB")
con <- odbcConnectAccess2007("c:/ .mdb", pwd = " ")

# list the tables, then fetch the first one
table_name <- sqlTables(con, tableType = "TABLE")$TABLE_NAME
TBL <- sqlFetch(con, table_name[1])
str(TBL)
head(TBL)

# run an arbitrary query on the same connection
qry <- "SELECT * FROM XXX"
result <- sqlQuery(con, qry)
str(result)

odbcCloseAll()
</code>

=== reshape2 ===

[[http://seananderson.ca/2013/10/19/reshape/|http://seananderson.ca/2013/10/19/reshape/]]

  * melt : wide → long. Every column except the key (id) columns is stacked into rows.
<code r>
melt(data, id.vars = c("KEYVALUE1", "KEYVALUE2"))
</code>
  * dcast : long → wide
<code r>
dcast(data, KEYVALUE1 + KEYVALUE2 ~ LONG_COLUMN)
</code>

=== igraph ===

  * Rotate (network) graph [[https://stackoverflow.com/questions/18440292/a-way-to-rotate-plot-in-r|https://stackoverflow.com/questions/18440292/a-way-to-rotate-plot-in-r]]
    * get the coordinates from a layout function
    * rotate them using a [[https://en.wikipedia.org/wiki/Rotation_matrix|rotation matrix]]

==== tidyverse ====

  * [[http://ggplot2.tidyverse.org/|ggplot2]], for data visualisation.
  * [[http://dplyr.tidyverse.org/|dplyr]], for data manipulation.
  * [[http://tidyr.tidyverse.org/|tidyr]], for data tidying.
  * [[http://readr.tidyverse.org/|readr]], for data import.
  * [[http://purrr.tidyverse.org/|purrr]], for functional programming.
  * [[http://tibble.tidyverse.org/|tibble]], for tibbles, a modern re-imagining of data frames.
  * [[https://github.com/tidyverse/stringr|stringr]], for strings.
  * [[https://github.com/hadley/forcats|forcats]], for factors.
  * [[https://mrchypark.github.io/post/rtips-tbl-자료형에서-소수점을-출력해보자/|Printing decimal places in tbl objects]]

=== tidyverse style guide ===

[[http://style.tidyverse.org/|http://style.tidyverse.org/]]

=== Dplyr ===

  * Row-wise sum/average across several columns:
<code r>
KED_not_PF %>%
  rowwise() %>%
  mutate(평균매출액 = mean(c(as.double(매출액3), as.double(매출액2), as.double(매출액1)),
                      na.rm = TRUE)) %>%
  ungroup()
</code>
  * mutate with column names taken from a character vector:
<code r>
# the i-th name is used both as the target column (:=) and as the input column (sym)
DATA %>%
  mutate(!!paste0("주요품목", 1:10)[i] :=
           gsub("응용SW_", "", !!rlang::sym(paste0("주요품목", 1:10)[i])))
</code>

=== ggplot2 ===

==== Web Apps ====

[[:software_development:html_css_javascript|HTML CSS Javascript]]

  * [[https://www.r-bloggers.com/deploying-desktop-apps-with-r/|Deploying Desktop Apps with R]] : using Shiny \\ R portable + Google Chrome portable + shell launch script

=== htmlWidgets ===

[[https://www.htmlwidgets.org/showcase_leaflet.html|https://www.htmlwidgets.org/showcase_leaflet.html]]

== visNetwork ==

[[https://datastorm-open.github.io/visNetwork/|https://datastorm-open.github.io/visNetwork/]]

[[https://cran.r-project.org/web/packages/visNetwork/vignettes/Introduction-to-visNetwork.html|https://cran.r-project.org/web/packages/visNetwork/vignettes/Introduction-to-visNetwork.html]]

  * Performance issue: use the methods below.
Trade appearance for speed:
<code r>
# chain these onto a visNetwork object with %>%
visIgraphLayout(layout = "layout.kamada.kawai")
visPhysics(stabilization = FALSE)
visEdges(smooth = FALSE)
</code>
  * visEvents

== DT ==

[[https://rstudio.github.io/DT/|https://rstudio.github.io/DT/]]

  * Formatting numbers with ''scales::comma()'' turns them into character strings, so sorting no longer works correctly: \\ use ''datatable(…) %>% formatCurrency()'' or similar instead.
  * callback

=== Interactive visNetwork plot with DT not using Shiny or Crosstalk ===

=== plotly ===

[[https://plotly-book.cpsievert.me/|https://plotly-book.cpsievert.me/]]

  * Network graph : ggplot2 → plotly

=== crosstalk ===

[[https://rstudio.github.io/crosstalk/|https://rstudio.github.io/crosstalk/]]

Linking views without Shiny: DT, Plotly, …

=== RMarkdown ===

== flexdashboard ==

[[https://rmarkdown.rstudio.com/flexdashboard/using.html|https://rmarkdown.rstudio.com/flexdashboard/using.html]]

~~DISCUSSION~~