• Title/Summary/Keyword: attribute tree


Clickstream Big Data Mining for Demographics based Digital Marketing (인구통계특성 기반 디지털 마케팅을 위한 클릭스트림 빅데이터 마이닝)

  • Park, Jiae;Cho, Yoonho
    • Journal of Intelligence and Information Systems / v.22 no.3 / pp.143-163 / 2016
  • The demographics of Internet users are the most basic and important source of information for target marketing and personalized advertising on digital marketing channels, which include email, mobile, and social media. However, it has gradually become difficult to collect the demographics of Internet users because their activities are anonymous in many cases. Although a marketing department can obtain demographics through online or offline surveys, these approaches are expensive, slow, and likely to include false statements. Clickstream data is the record an Internet user leaves behind while visiting websites. As the user clicks anywhere on a webpage, the activity is logged in semi-structured website log files. Such data shows which pages users visited, how long they stayed, how often and when they usually visited, which sites they prefer, what keywords they used to find a site, whether they purchased anything, and so forth. For this reason, some researchers have tried to infer the demographics of Internet users from their clickstream data. They derived various independent variables likely to be correlated with the demographics. The variables include search keywords, frequency and intensity by time, day, and month, the variety of websites visited, text information from the web pages visited, etc. The demographic attributes to predict also vary from paper to paper, and cover gender, age, job, location, income, education, marital status, and presence of children. A variety of data mining methods, such as LSA, SVM, decision trees, neural networks, logistic regression, and k-nearest neighbors, were used to build prediction models. However, this research has not yet identified which data mining method is appropriate for predicting each demographic variable. Moreover, the independent variables studied so far need to be reviewed, combined as needed, and evaluated in order to build the best prediction model.
The objective of this study is to choose the clickstream attributes most likely to be correlated with the demographics, based on the results of previous research, and then to identify which data mining method is best suited to predicting each demographic attribute. Among the demographic attributes, this paper focuses on predicting gender, age, marital status, residence, and job. Based on the results of previous research, 64 clickstream attributes are applied to predict the demographic attributes. The overall process of predictive model building is composed of four steps. In the first step, we create user profiles that include 64 clickstream attributes and 5 demographic attributes. The second step performs dimension reduction of the clickstream variables to address the curse of dimensionality and the overfitting problem. We utilize three approaches, based on decision tree, PCA, and cluster analysis. In the third step, we build alternative predictive models for each demographic variable. SVM, neural network, and logistic regression are used for modeling. The last step evaluates the alternative models in terms of model accuracy and selects the best model. For the experiments, we used clickstream data representing 5 demographics and 16,962,705 online activities for 5,000 Internet users. IBM SPSS Modeler 17.0 was used for our prediction process, and 5-fold cross-validation was conducted to enhance the reliability of our experiments. The experimental results verify that there is a specific data mining method well suited to each demographic variable. For example, age prediction performs best when using decision-tree-based dimension reduction with a neural network, whereas the prediction of gender and marital status is most accurate when applying SVM without dimension reduction.
We conclude that the online behaviors of Internet users, captured through clickstream data analysis, can be used effectively to predict their demographics and thereby be applied to digital marketing.
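
The evaluation step described in this abstract (comparing alternative models by accuracy under 5-fold cross-validation and keeping the best one) can be sketched roughly as follows. This is only an illustrative sketch in plain Python, not the authors' IBM SPSS Modeler workflow: the two candidate models (a majority-class baseline and a 1-nearest-neighbor rule) are simple stand-ins for the SVM, neural network, and logistic regression models, and all function names and the toy data are hypothetical.

```python
from collections import Counter


def k_fold_splits(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size


def fit_majority(X, y):
    """Stand-in model: always predict the most common training label."""
    label = Counter(y).most_common(1)[0][0]
    return lambda row: label


def fit_nearest_neighbor(X, y):
    """Stand-in model: predict the label of the closest training row."""
    def predict(row):
        dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for x in X]
        return y[dists.index(min(dists))]
    return predict


def cross_val_accuracy(fit, X, y, k=5):
    """Mean accuracy of one candidate model over k folds."""
    scores = []
    for train_idx, test_idx in k_fold_splits(len(X), k):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        hits = sum(model(X[i]) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)


def select_best_model(candidates, X, y, k=5):
    """Return (name, accuracy) of the candidate with the highest CV accuracy."""
    results = {name: cross_val_accuracy(fit, X, y, k) for name, fit in candidates.items()}
    best = max(results, key=results.get)
    return best, results[best]


# Hypothetical toy profiles: one clickstream feature, one demographic label.
X = [[0.0], [0.1], [1.0], [1.1], [0.2], [1.2], [0.3], [1.3], [0.05], [1.05]]
y = ["a", "a", "b", "b", "a", "b", "a", "b", "a", "b"]
candidates = {"majority": fit_majority, "nearest_neighbor": fit_nearest_neighbor}
best_name, best_acc = select_best_model(candidates, X, y, k=5)
print(best_name, best_acc)
```

In a fuller version, each demographic attribute would get its own call to `select_best_model`, once per dimension-reduction variant, which mirrors how the abstract reports a different best combination for age than for gender and marital status.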

Management Guideline and Avifauna of Odaesan area in Odaesan National Park (오대산 국립공원 오대산 지역의 야생조류상 및 관리 방안)

  • 이우신;박찬열;조기현
    • Korean Journal of Environment and Ecology / v.10 no.1 / pp.1-13 / 1996
  • This study was conducted to investigate the avifauna and to suggest management guidelines for the protection of the bird community in Odaesan National Park. A field survey was carried out along 2 main trails by the line transect method from mid-June to early December in 1996. The 1st section covered the 7 km from Woljongsa to Sangwonsa. The 2nd section ran 9.8 km from Sangwonsa via Bukdaesa and the summit of Odaesan back to Sangwonsa. The results were as follows. The observed birds belonged to 9 orders, 22 families, and 52 species, including the Black Woodpecker (Dryocopus martius), designated as Natural Monument No. 242, the Chinese Sparrow Hawk (Accipiter soloensis) and Kestrel (Falco tinnunculus), designated as Natural Monument No. 323, and the Scops Owl (Otus scops) and Korean Wood Owl (Strix aluco), designated as No. 324. These birds were classified into 25 resident species, 16 summer visitors, 8 passage migrants, and 3 winter visitors. The 2nd section showed high species richness and numbers of individuals in every season but differed from the 1st section in species composition. The nesting guilds of the breeding bird community used nest resources in the order of bush, hole, and canopy. The prominence of the bush-nesting guild appears closely related to the bush layer located in the ecotone of the 1st section and in the high-elevation zone of the 2nd section. Hole-nesting species such as the Black Woodpecker (Dryocopus martius), Gray-headed Woodpecker (Picus canus), and Great Spotted Woodpecker (Dendrocopos major) were recorded only in the 2nd section, which could be attributed to the low degree of fragmentation and the growth of trees with large diameter at breast height (D.B.H.) in that section. For bird protection, management of the camping ground and of visitors is urgently needed to conserve the brook in the 1st section, as is trail protection to prevent trail enlargement in the 2nd section.
Artificial feeding in snowy winters will provide good breeding conditions for residents and migrants. In addition, efforts to lessen habitat fragmentation will benefit birds with large home ranges, such as the Black Woodpecker (Dryocopus martius) and Korean Wood Owl (Strix aluco). To control Domestic Dove (Columba livia) populations, eliminating their nesting resources with nets is recommended.


Species-specific Growth Responses of Betula costata, Fraxinus rhynchophylla, and Quercus variabilis Seedlings to Open-field Artificial Warming (거제수나무, 물푸레나무, 굴참나무 묘목의 실외 인위적 온난화에 대한 수종 특이적 생장 반응)

  • Han, Saerom;An, Jiae;Yoon, Tae Kyung;Yun, Soon Jin;Hwang, Jaehong;Cho, Min Seok;Son, Yowhan
    • Korean Journal of Agricultural and Forest Meteorology / v.16 no.3 / pp.219-226 / 2014
  • Evaluating tree responses to temperature elevation is critical for the development of forest management techniques that cope with climate change. We conducted a study on the growth responses of Betula costata, Fraxinus rhynchophylla, and Quercus variabilis seedlings to open-field artificial warming. An artificial warming set-up based on infrared heating was built in 2012, and the temperature in warmed plots was regulated to be consistently $3^{\circ}C$ higher than that of control plots. The seeds of the three species were sown, and the responses of growth, biomass allocation, and net photosynthetic rate of newly germinated seedlings to the open-field artificial warming were determined. The growth responses of the seedlings differed among species. B. costata showed decreases in the height to diameter ratio (H/D ratio), biomass, root weight to shoot weight ratio, and net photosynthetic rate. In contrast, root collar diameter (RCD), height, biomass, and net photosynthetic rate of Q. variabilis increased, while the response of F. rhynchophylla was unclear. There was no significant difference in seedling growth between warmed and control plots for the 3 species in July, whereas under warming in November the RCD, height, and H/D ratio of Q. variabilis increased and the H/D ratio of B. costata decreased. Species-specific growth responses to warming were similar to the species-specific responses of net photosynthetic rate and biomass allocation; therefore, net photosynthetic rate and biomass allocation might contribute to the growth responses to warming. In addition, the relatively clearer response in autumn compared to summer might be due to phenological change following artificial warming. The species-specific responses of the three deciduous species to warming found in this study could be applied to the development of adaptive forest management policies for climate change.

A Study on Operation Strategy by Multi-variate Regression of Daegu Arboretum Visitor's Satisfaction (대구수목원 이용객 만족모델을 통한 운영 방안 연구)

  • Kang, Kee-Rae
    • Journal of Korean Society of Forest Science / v.101 no.1 / pp.36-45 / 2012
  • Education on the environment and plants offered by arboretums not only helps foster a better natural environment in urban regions but also provides visitors with a pleasant setting for refreshment. In this study, the author examined the usage behavior and satisfaction model of arboretum visitors and investigated the facilities and programs that visitors expect an arboretum to offer, in order to propose recommendations for its services. In the multiple regression analysis, the variables influencing satisfaction were, in order of effect size, the walking routes (lines of flow), the educational effect regarding the environment, cleanliness of the facilities, visit cost, natural beauty, diversity of trees, accessibility, friendliness of staff, and the expansion and supplementation of facilities in the arboretum. As for visitor attributes, residents living near the facility showed the highest visit frequency, more than 5 times, mostly as part of taking a walk. This indicates that a visit to the arboretum is considered part of everyday life, and thus new programs, walking paths, and movement routes need to be developed for visitors. When asked about the facilities and operating programs of Daegu Arboretum, particularly about their requests, visitors responded that establishing cultural events, beautiful natural scenery, and refreshment and convenience facilities was the most critical issue. In addition, the management of withered trees and bare land is an urgent issue as well.
Accordingly, operation and management strategies based on visitor behavior and the satisfaction model need to address the adoption of diverse events and festivals involving local residents, an ombudsman program, the development of environmental programs for students and teachers in the region, the restoration of neglected bare land and replacement of withered trees, the improvement and supplementation of cafeteria facilities, and the benchmarking of facilities other than arboretums in other regions. Daegu Arboretum is thought to be able to address these items sufficiently without additional external resources. Visitor satisfaction begins with minor things, and a small difference can determine great satisfaction; thus a software approach rather than a hardware one is needed.

Ecological Changes of Insect-damaged Pinus densiflora Stands in the Southern Temperate Forest Zone of Korea (I) (솔잎혹파리 피해적송림(被害赤松林)의 생태학적(生態学的) 연구(研究) (I))

  • Yim, Kyong Bin;Lee, Kyong Jae;Kim, Yong Shik
    • Journal of Korean Society of Forest Science / v.52 no.1 / pp.58-71 / 1981
  • Thecodiplosis japonensis is sweeping through the Pinus densiflora forests from the south-west to the north-east, destroying almost all the aged large trees and even the young ones. The front line of infestation is moving slowly but ceaselessly northwards as a long battle front. It is estimated that more than 40 percent of the area of P. densiflora forest has already been damaged; however, some individuals escape the damage and contribute to restoring the site to its previous vegetation composition. When stands are attacked by this insect, drastic openings usually result in the upper story of the tree canopy, which is formed exclusively by P. densiflora, and environmental factors such as light, temperature, litter accumulation, soil moisture, and others are naturally modified. With these changes after insect invasion, phytosociological changes of the vegetation gradually proceed as time passes. If forests are classified by the history of the insect outbreak, namely non-attacked (healthy forest), recently damaged (the outbreak occurred about 1-2 years ago), severely damaged (occurred 5-6 years ago), damage prolonged (occurred 10 years ago), and restored (occurred about 20 years ago), any directional changes of vegetation composition could be traced in line with these four progressive stages. To elucidate these changes, three survey districts were set: (1) "Gongju", where the damage was severe and the outbreak began in 1977, (2) "Buyeo", where the damage was prolonged, and (3) "Gochang", which was restored (see Tab. 1). All of these are located in the south temperate forest zone, which is delimited mainly by the temperature factor and is generally accepted without opposition at present. In view of temperature, the amount and distribution of precipitation, and various soil factors, the overall homogeneity of environmental conditions among the survey districts might be accepted.
However, this does not rule out that small differences in edaphic and topographic conditions and microclimates can alter vegetation patterns. Four survey plots were set in each district, with an inter-plot distance of 3 to 4 km, and four subplots were set within each survey plot. The size of a subplot was $10\,\mathrm{m} \times 10\,\mathrm{m}$ for woody vegetation and $5\,\mathrm{m} \times 5\,\mathrm{m}$ for ground cover vegetation less than 2 m high. The nested quadrat method was adopted. In selecting survey plots, the following criteria were taken into account: (1) natural growth with more than 80 percent crown density in the upper canopy and more than 5 hectares in area; (2) no natural or artificial disturbance, such as fire or thinning, for the past three decades; (3) altitude lower than 500 m; (4) slope of less than 20 degrees; and (5) northerly aspect. An intensive vegetation survey was undertaken during the summer of 1980. The vegetation was divided into 3 categories for sampling: the upper layer (dominated mainly by pine trees), the middle layer (composed of oak species and other broad-leaved trees as well as pine), and the ground or lower layer (shrubby forms of woody plants). In this study, the survey concentrated on woody species only. For the vegetation analysis, values of intensity, frequency, cover, relative importance, species diversity, dominance, and the similarity and dissimilarity indices were calculated. When importance values were calculated, different relative weights were arbitrarily given to each layer as scores, i.e., 3 points for the upper layer, 2 for the middle layer, and 1 for the ground layer. The formula then becomes: $$\mathrm{R.I.V.} = \frac{3\,(IV_{\text{upper}}) + 2\,(IV_{\text{middle}}) + 1\,(IV_{\text{ground}})}{6}$$ where $IV_{\text{upper}}$, $IV_{\text{middle}}$, and $IV_{\text{ground}}$ are the importance values in the upper, middle, and ground layers. The values of the Similarity Index were calculated on the basis of the Relative Importance Value of trees (sum of relative density, frequency, and cover).
The formula used is: $$\mathrm{S.I.} = \frac{2C}{S_1+S_2} \times 100 = \frac{2C}{100+100} \times 100 = C\,(\%)$$ where $C$ is the sum of the lower of the two quantitative values for each species shared by the two communities, $S_1$ is the sum of all values for the first community, and $S_2$ is the sum of all values for the second community. Tab. 3 presents the species composition of each plot by layer and by district. Without exception, the species forming the upper layer of the stands was Pinus densiflora. The table gives the relative cover (%), density (number of trees per $500\,\mathrm{m}^2$), the range of height and diameter at breast height, and cone-bearing tendency. In the middle layer, Quercus spp. (Q. aliena, Q. serrata, Q. mongolica, Q. acutissima, and Q. variabilis) and Pinus densiflora were dominant. The genera Rhododendron and Lespedeza were abundant in the ground vegetation, along with some oaks. (1) Gongju district: The total number of woody species in this district was 26. The relative importance value (R.I.V.) of Pinus densiflora in the upper layer was 79.1%; in the middle layer, the R.I.V. values of Quercus acutissima, Pinus densiflora, and Quercus aliena were 22.8%, 18.7%, and 10.0%, respectively; and in the ground vegetation, the values were 17.0% for Q. mongolica, 16.8% for Q. serrata, 11.8% for Corylus heterophylla, and 11.3% for Q. dentata, in that order. (2) Buyeo district: The number of species enumerated in this district was 36, and the R.I.V. of Pinus densiflora in the upper layer was 100%. In the middle layer, the R.I.V. values of Q. variabilis and Q. serrata were 8.6% and 8.5%, respectively. In the ground vegetation, 24 species were counted, each with no more than 5% R.I.V. The mean R.I.V. of P. densiflora (combining the three layers and averaging the four plots) was 57.7%, in contrast to 46.9% for the Gongju district. (3) Gochang district: The total number of woody species was 23, and the mean R.I.V. of Pinus densiflora was 66.0%, greater than in the two former districts. The next highest value was 6.5% for Q. serrata.
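
The layer-weighted R.I.V. and the Similarity Index defined above can be written as two small functions. This is a minimal sketch in Python under stated assumptions: the function names and the example values are hypothetical illustrations, not data from the study, and each community is represented as a mapping from species name to its relative importance value.

```python
def relative_importance_value(iv_upper, iv_middle, iv_ground):
    """Layer-weighted R.I.V.: weights 3, 2, 1 for upper, middle, ground layers."""
    return (3 * iv_upper + 2 * iv_middle + 1 * iv_ground) / 6


def similarity_index(community_1, community_2):
    """S.I. = 2C / (S1 + S2) * 100.

    C is the sum, over species shared by both communities, of the lower
    of the two values; S1 and S2 are the totals for each community.
    """
    shared = set(community_1) & set(community_2)
    c = sum(min(community_1[s], community_2[s]) for s in shared)
    s1 = sum(community_1.values())
    s2 = sum(community_2.values())
    return 100 * 2 * c / (s1 + s2)


# Hypothetical values for illustration only (each community sums to 100).
plot_a = {"Pinus densiflora": 60.0, "Quercus serrata": 25.0, "Quercus aliena": 15.0}
plot_b = {"Pinus densiflora": 70.0, "Quercus serrata": 10.0, "Lespedeza sp.": 20.0}

riv = relative_importance_value(100.0, 50.0, 20.0)
si = similarity_index(plot_a, plot_b)
print(riv, si)
```

When both communities sum to 100, as in the abstract's simplification, the index reduces to C itself: here the shared minimum values are 60 and 10, so the similarity is 70%.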
As time passed since the insect outbreak, the mean R.I.V. of P. densiflora increased in the order 46.9%, 57.7%, and 66%. This implies that P. densiflora was returning to its original dominant state. The pooled importance of the genus Quercus decreased as that of Pinus densiflora increased. This trend contradicts the findings for the Kyonggi-do area (the central temperate forest zone) reported previously (Yim et al., 1980). Within the genus Quercus, Quercus acutissima, a warm-loving species, was more abundant in the southern temperate zone, with which the present research is concerned, than in the central temperate zone. The reverse was true of Q. mongolica, a cold-loving species. The species not common to the present survey and the previous report are Carpinus cordata, Betula davurica, Wisteria floribunda, Weigela subsessilis, Gleditsia japonica var. koraiensis, Acer pseudosieboldianum, Euonymus japonica var. macrophylla, Ribes mandshuricum, Pyrus calleryana var. fauriei, Tilia amurensis, and Pyrus pyrifolia. Maximum species diversity (maximum H'), species diversity (H'), and evenness (J') are presented in Figure 4 and Table 5. The similarity indices between districts are shown in Tab. 5. Fig. 6 shows a two-dimensional ordination of plots on the basis of X and Y coordinates: the Ai plots aggregate on the left side, the Bi plots on the lower side, and the Ci plots on the upper-right side. The increasing and decreasing patterns of Relative Density and Relative Importance Value by genus or species are given in Fig. 7. Some of the patterns presented here are not consistent with those previously reported (Yim et al., 1980).
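
The abstract reports species diversity (H'), maximum species diversity (maximum H'), and evenness (J') without stating how they were computed. The sketch below assumes the common Shannon form with natural logarithms, H' = -Σ p_i ln p_i, maximum H' = ln S, and J' = H' / maximum H'; both the formulas and the function names are assumptions for illustration and may differ from the authors' exact definitions or log base.

```python
import math


def shannon_diversity(abundances):
    """H' = -sum(p_i * ln p_i) over species with non-zero abundance."""
    total = sum(abundances)
    proportions = [a / total for a in abundances if a > 0]
    return -sum(p * math.log(p) for p in proportions)


def max_diversity(abundances):
    """Maximum H' = ln S, where S is the number of species present."""
    species_count = sum(1 for a in abundances if a > 0)
    return math.log(species_count)


def evenness(abundances):
    """J' = H' / maximum H'; defined as 0.0 when only one species is present."""
    h_max = max_diversity(abundances)
    if h_max == 0:
        return 0.0
    return shannon_diversity(abundances) / h_max


# Hypothetical abundance vectors for illustration only.
even_stand = [10, 10, 10, 10]   # four equally abundant species
pine_stand = [90, 5, 5]         # one strongly dominant species
print(evenness(even_stand), evenness(pine_stand))
```

A stand with equally abundant species reaches the maximum evenness of 1, while a stand dominated by a single species, as in the pine-dominated upper layers described here, scores well below that.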
The present authors attribute this to two distinct types of insect attack: one is the short-war type occurring in the south temperate forest zone, in which the insect attack lasted only a few years; the other is the long-drawn war type observed in the temperate forest zone, in which the insect damage continued for several years. These different patterns of infestation might have resulted in different courses of vegetational change. Analysis of the similarity indices between districts gave convincing results: the value of the dissimilarity index was 30% between A and B, 27% between B and C, and 35% between A and C (Table 6). The range of the similarity index was obtained by calculating every possible combination of plots between two districts. Longer isolation between communities brought higher values of the dissimilarity index. The main components of the ground vegetation 10 to 20 years after the insect outbreak come to consist mainly of the genera Lespedeza and Rhododendron. The genus Quercus, which held the top dominant state for a while after the insect attack, was giving way to Pinus densiflora. This implies that, provided soil fertility, soil moisture, and soil depth were good enough, the genus Quercus would never have been so easily taken over by a resistant species like Pinus densiflora, which forms the edaphic climax over vast areas of forest land. Quercus is usually regarded as the representative component of the undisturbed natural forest in the central part of this country.
