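Research Collaboration

One of the most important pillars of the Statistical Science Research Center is advancing innovative research in statistical science and data science. To that end, two programs aim to add something new to the existing research network: the Proposal-Based Collaborative Research Project program, in which members from various fields of the graduate school work together in person, and the Researcher Connected Society program, a virtual, worldwide network of researchers based at the graduate school.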

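Proposal-Based Collaborative Research Projects

Working with researchers in each field through statistical science, these projects identify cutting-edge research themes and carry out developmental and applied research on new data analysis methods.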

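Global Collaborative Network

Building the Researcher Connected Society.
Drawing on the activities and results of collaborative projects and academic seminars, we are building the Researcher Connected Society, a worldwide collaborative network of small societies, each specialized in a cutting-edge research topic.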

[Figure: relationship diagram]

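Commissioned and Collaborative Research

In commissioned and collaborative research grounded in statistical science, we consult closely with the division that carries out the data analysis, develop new statistical models or improve existing ones, apply them to real data, and refine the models further where necessary. This feedback-driven research and development continues until the intended results are reached.

We are currently arranging a program for accepting commissioned and collaborative research from companies.
For the time being, please contact us through the consultation desk of Tohoku University's industry-academia collaboration office (産学連携機構).
(https://www.rpip.tohoku.ac.jp/jp/aboutus/desk/)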

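Collaboration Between Statistical Science and Other Fields

Case Study: Pediatric Medicine

This was joint research between the statistical science laboratory, which had been studying functional mixed-effects models for high-dimensional longitudinal data with complex structure, and the Department of Pediatrics at Nagoya City University, which was seeking to clarify the mechanism behind changes in cortisol levels in preterm newborns requiring hospitalization. Cortisol level is a marker of human biological rhythms, but because it varies widely, conventional methods had not produced adequate results. By combining the functional mixed-effects model with quantile regression, the study was able to capture the mechanism of change in cortisol levels appropriately.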

Tanabe, Y., Araki, Y., Kinoshita, M. et al. Bayesian nonparametric quantile mixed-effects models via regularization using Gaussian process priors. Jpn J Stat Data Sci 5, 241–267 (2022).

Abstract

In this study, we proposed using Bayesian nonparametric quantile mixed-effects models (BNQMs) to estimate the nonlinear structure of quantiles in hierarchical data. Assuming that a nonlinear function representing a phenomenon of interest cannot be specified in advance, a BNQM can estimate the nonlinear function of quantile features using the basis expansion method. Furthermore, BNQMs adjust the smoothness to prevent overfitting by regularization. 
We also proposed a Bayesian regularization method using Gaussian process priors for the coefficient parameters of the basis functions, and showed that the problem of overfitting can be reduced when the number of basis functions is excessive for the complexity of the nonlinear structure. Although computational cost is often a problem in quantile regression modeling, BNQMs ensure the computational cost is not too high using a fully Bayesian method. Using numerical experiments, we showed that the proposed model can estimate nonlinear structures of quantiles from hierarchical data more accurately than the comparison models in terms of mean squared error. Finally, to determine the cortisol circadian rhythm in infants, we applied a BNQM to longitudinal data of urinary cortisol concentration collected at Kurume University. The result suggested that infants have a bimodal cortisol circadian rhythm before their biological rhythms are established.
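
The idea of estimating a quantile curve through a basis expansion with regularization can be illustrated with a much simpler, non-Bayesian sketch. The code below is not the BNQM of the paper: it has no random effects and no Gaussian process prior. It assumes a Fourier basis with a 24-hour period, a ridge penalty, and plain subgradient descent on the pinball loss, and it runs on synthetic data. All function names and parameter values are hypothetical choices made for illustration.

```python
import math
import random

def pinball_loss(residual, tau):
    """Check (pinball) loss of quantile regression for a level tau in (0, 1)."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))

def fourier_basis(t, n_harmonics=2, period=24.0):
    """Evaluate the basis [1, cos, sin, ...] at time t in hours (an assumed basis)."""
    feats = [1.0]
    for k in range(1, n_harmonics + 1):
        angle = 2.0 * math.pi * k * t / period
        feats.extend([math.cos(angle), math.sin(angle)])
    return feats

def objective(beta, design, values, tau, ridge):
    """Mean pinball loss plus a ridge penalty on the non-intercept coefficients."""
    loss = 0.0
    for x, y in zip(design, values):
        loss += pinball_loss(y - sum(b * xi for b, xi in zip(beta, x)), tau)
    return loss / len(values) + ridge * sum(b * b for b in beta[1:])

def fit_quantile_curve(times, values, tau, ridge=1e-3, step=0.5, n_iter=1000):
    """Subgradient descent with decaying steps; returns the best iterate seen."""
    design = [fourier_basis(t) for t in times]
    n = len(values)
    # Start from the constant model at the empirical tau-quantile.
    start = sorted(values)[max(math.ceil(n * tau) - 1, 0)]
    beta = [start] + [0.0] * (len(design[0]) - 1)
    best_beta, best_obj = list(beta), objective(beta, design, values, tau, ridge)
    for it in range(n_iter):
        grad = [0.0] + [2.0 * ridge * b for b in beta[1:]]
        for x, y in zip(design, values):
            residual = y - sum(b * xi for b, xi in zip(beta, x))
            slope = -tau if residual > 0 else 1.0 - tau  # subgradient w.r.t. the fit
            for j, xj in enumerate(x):
                grad[j] += slope * xj / n
        rate = step / math.sqrt(it + 1.0)
        beta = [b - rate * g for b, g in zip(beta, grad)]
        obj = objective(beta, design, values, tau, ridge)
        if obj < best_obj:
            best_beta, best_obj = list(beta), obj
    return best_beta

# Synthetic 24-hour profile: a daily rhythm plus bounded noise (not real data).
rng = random.Random(0)
times = [0.5 * i for i in range(48)]
values = [10.0 + 5.0 * math.sin(2.0 * math.pi * t / 24.0) + rng.uniform(-2.0, 2.0)
          for t in times]
design = [fourier_basis(t) for t in times]
beta_90 = fit_quantile_curve(times, values, tau=0.9)  # upper quantile curve
beta_10 = fit_quantile_curve(times, values, tau=0.1)  # lower quantile curve
```

Fitting several quantile levels, rather than only the mean, is what lets this kind of model describe data with large variability. The ridge penalty here only stands in for the role that regularization plays in the paper.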

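Case Study: Epidemiology

This was international joint research with Professor Cologne of the Radiation Effects Research Foundation (RERF, Hiroshima). The research material, RERF's long-term follow-up data on approximately 5,000 Japanese adults, is a body-dynamics dataset from long-term, multi-item health examinations that is unique in Japan: collection began as early as 1957 and continued for 48 years. Focusing on residuals in the survival analysis, the study rigorously demonstrated that, rather than differences in health dynamics attributable to differences in atomic bomb exposure, it is repeated extreme weight fluctuation that raises a person's risk of death.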

Cologne J, Takahashi I, French B, et al. Association of Weight Fluctuation With Mortality in Japanese Adults. JAMA Netw Open. 2019;2(3):e190731.

Abstract

Importance: Weight cycling is associated with the risk of mortality from heart disease, but many studies have not distinguished between simple nonlinear (monotone) weight changes and more complex changes that reflect fluctuations.
Objective: To assess whether extreme body weight variation is associated with mortality after controlling for nonlinear weight changes.
Design, Setting, and Participants: In this prospective clinical cohort study, 4796 Japanese atomic bomb survivors were examined in the clinic as part of a biennial health examination and research program. The study consisted of a 20-year longitudinal baseline period (July 1, 1958, to June 30, 1978) and subsequent mortality follow-up of 27 years (July 1, 1978, to June 30, 2005). Participants were initially between the ages of 20 and 49 years during the baseline period and, throughout the baseline period, had no diagnoses of cardiovascular disease (CVD) or cancer and attended at least 7 of 10 scheduled examinations. Data analysis was performed from October 16, 2015, to May 13, 2016.
Exposures: Residual variability in body mass index (BMI) during the baseline period.
Main Outcomes and Measures: Outcomes were mortality from ischemic heart disease, cerebrovascular disease, other CVDs combined, other causes (except cancer), and cancer. Root mean squared error was calculated to capture individual residual variation in BMI after adjustment for baseline BMI trends, and the association of magnitude of residual variation with mortality was calculated as relative risk.
Results: In total, 4796 persons (mean [SD] age, 35.0 [7.3] years at first baseline examination; 3252 [67.8%] female; mean [SD] BMI, 21.2 [2.8] at first baseline visit [20.6 (2.4) among men and 21.5 (2.9) among women]) participated in the study. During follow-up, 1550 participants died: 82 (5.3% of all deaths) of ischemic heart disease, 181 (11.7%) of cerebrovascular disease, 186 (12.0%) of other CVDs, 615 (39.7%) of cancer, and 486 (31.3%) of other causes. Magnitude of residual variation in weight was associated with all-cause mortality (relative risk, 1.25 for 1 U of additional variation; 95% CI, 1.06-1.47) and ischemic heart disease mortality (relative risk, 2.49; 95% CI, 1.41-4.38).
Conclusions and Relevance: The findings suggest that an association exists between weight variation and heart disease mortality and that weight loss interventions, if deemed to be necessary, should be considered carefully.
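
The exposure measure described in the abstract, residual variation in BMI after adjusting for each person's trend, can be sketched as follows. The abstract does not state the form of the trend model, so this sketch assumes a quadratic polynomial in time fitted by ordinary least squares, and it divides the squared residuals by the number of examinations. Both are assumptions, and the two BMI series are made-up examples, not study data. The survival analysis that links this measure to mortality is not shown.

```python
import math

def fit_polynomial(times, values, degree=2):
    """Ordinary least squares polynomial fit via the normal equations."""
    p = degree + 1
    # Build X^T X and X^T y for the design matrix with columns t^0 .. t^degree.
    xtx = [[sum(t ** (i + j) for t in times) for j in range(p)] for i in range(p)]
    xty = [sum((t ** i) * y for t, y in zip(times, values)) for i in range(p)]
    # Gaussian elimination with partial pivoting on the augmented matrix.
    aug = [row + [rhs] for row, rhs in zip(xtx, xty)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, p):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, p + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    coef = [0.0] * p
    for i in reversed(range(p)):
        tail = sum(aug[i][j] * coef[j] for j in range(i + 1, p))
        coef[i] = (aug[i][p] - tail) / aug[i][i]
    return coef

def residual_rmse(times, values, degree=2):
    """Root mean squared residual around the fitted individual trend."""
    coef = fit_polynomial(times, values, degree)
    residuals = [y - sum(c * t ** k for k, c in enumerate(coef))
                 for t, y in zip(times, values)]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

# Ten biennial examinations over a 20-year baseline (hypothetical values).
exam_years = [2.0 * k for k in range(10)]
trend = [21.0 + 0.1 * t - 0.002 * t * t for t in exam_years]
steady_bmi = trend                                   # smooth nonlinear change only
cycling_bmi = [b + (1.0 if k % 2 == 0 else -1.0)     # same trend plus fluctuation
               for k, b in enumerate(trend)]

rmse_steady = residual_rmse(exam_years, steady_bmi)
rmse_cycling = residual_rmse(exam_years, cycling_bmi)
```

Both hypothetical people follow the same smooth BMI trend, but only the second shows a large residual RMSE. This is the distinction the abstract draws between simple nonlinear weight change and fluctuation.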

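Case Study: Engineering and Cognitive Science

This was research with the Nishida laboratory (西田昌史研究室) in the Faculty of Informatics at Shizuoka University. The study introduced functional data analysis to sign language research and proposed a sign language motion analysis based on multivariate functional principal component analysis, with the aim of improving both sign language analysis methods and sign language education systems. Sign language is inherently composed of continuous movements, so measurements of signing motion are high-dimensional spatiotemporal data. By applying functional principal component analysis to the motion data, the study extracted movement in space and time as a small number of components. This result had not been obtained with conventional deep learning approaches.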

櫻田 京之介, 荒木 由布子, 和泉 勇希, 西田 昌史. 手話データの多変量関数主成分分析. ヒューマンインタフェース学会論文誌, 2020, 22(4), 475-484.

Abstract

The aim of this paper is to establish a novel statistical method for characterizing sign language movements at multiple body parts simultaneously. The method we applied is multivariate functional principal component analysis (MFPCA), which is capable of capturing the individual variation of sign language movements using not only palm movements but also multiple movements such as those of the fingers, elbows, and shoulders. This method successfully captures the characteristic that sign language is composed of a combination of multiple consecutive actions. We apply MFPCA to quantify the differences in variation of performance among ten beginners and one master of sign language, measured at nineteen body parts. The results of MFPCA quantify the individual qualities of the sign language by making use of multivariate functional principal component scores. At the same time, MFPCA revealed in which body part the characteristic movement of an individual's sign language appears strongly. Finally, we distinguished some words that tend to be difficult or easy to learn.
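
A rough sketch of how principal component scores can summarize multi-part motion curves is given here. It is a simplification and not the MFPCA procedure of the paper: each curve is treated as a vector sampled on a common time grid, the curves of two hypothetical body parts are concatenated, and the leading component is found by power iteration. The data are synthetic, with signers differing only in movement amplitude, and all names are illustrative assumptions.

```python
import math

def center(rows):
    """Subtract the cross-subject mean curve from every row."""
    n = len(rows)
    mean = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, mean)] for row in rows]

def leading_component(centered, n_iter=100):
    """Power iteration for the top eigenvector of the covariance X^T X / (n - 1).

    Assumes the centered data are not all zero.
    """
    # Start from the row with the largest norm, which lies in the row space of X.
    v = max(centered, key=lambda row: sum(x * x for x in row))
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(n_iter):
        scores = [sum(x * vj for x, vj in zip(row, v)) for row in centered]
        w = [sum(s * row[j] for s, row in zip(scores, centered))
             for j in range(len(v))]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Sign convention: make the largest-magnitude loading positive.
    if max(v, key=abs) < 0:
        v = [-x for x in v]
    scores = [sum(x * vj for x, vj in zip(row, v)) for row in centered]
    eigenvalue = sum(s * s for s in scores) / (len(centered) - 1)
    return v, scores, eigenvalue

# Synthetic signers: the same movement shape at different amplitudes.
grid = [j / 19 for j in range(20)]
amplitudes = [0.6, 0.8, 1.0, 1.2, 1.4]

def subject_vector(a):
    hand = [a * math.sin(math.pi * t) for t in grid]          # hypothetical hand path
    elbow = [0.5 * a * math.sin(math.pi * t) for t in grid]   # hypothetical elbow path
    return hand + elbow  # concatenate body parts into one multivariate curve

centered = center([subject_vector(a) for a in amplitudes])
component, scores, eigenvalue = leading_component(centered)
total_variance = sum(x * x for row in centered for x in row) / (len(centered) - 1)
explained = eigenvalue / total_variance
```

In this toy setting a single component carries essentially all of the variation, and each signer's score orders them by movement amplitude. Real motion data would need several components and a smoothing or basis-expansion step before the decomposition.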