Centiloid Longitudinal Analysis
To be reviewed
Heads up
Merging the data
Pulling the Centiloid values for each visit is straightforward with xnat_pullcl.pl.
But the crux is that, if we are going to analyze the data longitudinally, I will need the date of each acquisition. So I tweaked the script a bit and added a -d option so it also pulls the dates.
More undocumented features; I have a feeling this will bite me within six months.
Now, for each subject pulled from XNAT, the script looks up the PET experiment and its corresponding date.
if($with_date){
    my $ptfile = $tmpdir.'/pet.results.tmp';
    $order = "awk -F\",\" '{print \$2\",\"\$3\",\"\$6\",\"\$7}' ".$tmpdir."/xnat_pet_results.csv | sed 's/\"//g' | tail -n +2 | sort -t, -k 1 > ".$ptfile;
    system($order);
    open IDF, "<$ptfile" or die "No temporary file at all\n";
    my $optfile = $tmpdir.'/pet.results';
    open ODF, ">$optfile" or die "Could not open output file\n";
    while(<IDF>){
        my ($sbj, $pxp, $suvr, $cl) = /(.*),(.*),(.*),(.*)/;
        # xnatapic list_experiments --project_id f5cehbi --subject_id XNAT_S00086 --modality PET --date | awk -F',' '{print $2}'
        my $sorder = 'xnatapic list_experiments --project_id '.$xprj.' --subject_id '.$sbj." --modality PET --date | awk -F',' '{print \$2}'";
        my ($xdate) = qx/$sorder/;
        chomp $xdate;
        print ODF "$sbj,$xdate,$suvr,$cl\n";
    }
    close ODF;
    close IDF;
}else{
    $order = "awk -F\",\" '{print \$2\",\"\$6\",\"\$7}' ".$tmpdir."/xnat_pet_results.csv | sed 's/\"//g' | tail -n +2 | sort -t, -k 1 > ".$tmpdir."/pet.results";
    system($order);
}
And of course the output has to be arranged accordingly,
if($with_date){
    $order = "join -t, ".$tmpdir."/xnat_subjects_sorted.list ".$tmpdir."/pet.results | sort -t, -k 2 | awk -F\",\" '{if (\$3) print \$2\";\"\$3\";\"\$4\";\"\$5}' | sed '1iSubject;Date;SUVR;Centiloid'";
}else{
    # no dates in this branch, so the header has one field less
    $order = "join -t, ".$tmpdir."/xnat_subjects_sorted.list ".$tmpdir."/pet.results | sort -t, -k 2 | awk -F\",\" '{if (\$3) print \$2\";\"\$3\";\"\$4}' | sed '1iSubject;SUVR;Centiloid'";
}
Now, the first step is to download the results from XNAT,
[osotolongo@brick03 lmodel_facehbi]$ xnat_pullcl.pl -d -x facehbi > facehbi_v0_cl.csv
[osotolongo@brick03 lmodel_facehbi]$ xnat_pullcl.pl -d -x f2cehbi > facehbi_v2_cl.csv
[osotolongo@brick03 lmodel_facehbi]$ xnat_pullcl.pl -d -x f5cehbi > facehbi_v5_cl.csv
To assemble the longitudinal DB, the first step is to paste everything together,
[osotolongo@brick03 lmodel_facehbi]$ awk -F";" '{print $1",0,"$2","$4}' facehbi_v0_cl.csv | sed 's/Subject,0/Subject,Visit/' > facehbi_v0_data.csv
[osotolongo@brick03 lmodel_facehbi]$ awk -F";" '{print $1",2,"$2","$4}' facehbi_v2_cl.csv | tail -n +2 > facehbi_v2_data.csv
[osotolongo@brick03 lmodel_facehbi]$ awk -F";" '{print $1",5,"$2","$4}' facehbi_v5_cl.csv | tail -n +2 > facehbi_v5_data.csv
[osotolongo@brick03 lmodel_facehbi]$ cat facehbi_v0_data.csv facehbi_v2_data.csv > tmp02.csv
[osotolongo@brick03 lmodel_facehbi]$ cat tmp02.csv facehbi_v5_data.csv > facehbi_data_unsorted.csv
[osotolongo@brick03 lmodel_facehbi]$ (head -n 1 facehbi_data_unsorted.csv; tail -n +2 facehbi_data_unsorted.csv | sort -t"," -k 1,2 ) > facehbi_data.csv
[osotolongo@brick03 lmodel_facehbi]$ head facehbi_data.csv
Subject,Visit,Date,Centiloid
F001,0,2014-12-11,-4.8572789076631135
F001,2,2017-01-26,-1.6070096780176448
F001,5,2020-02-06,5.995931423082964
F002,0,2014-12-11,11.328231785153815
F002,2,2017-04-20,8.026013832802747
F003,0,2014-12-18,-5.678781553555588
F003,2,2017-01-26,-6.6177593835407915
F004,0,2014-12-18,11.271685160898205
F005,0,2015-01-22,-6.979040300640364
This way we can look at the mean evolution of amyloid deposition across visits.
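As a quick sanity check, the per-visit mean can be pulled straight from the merged table with awk. This is a sketch on a small inline sample with made-up CL values; against the real data it would be `tail -n +2 facehbi_data.csv | awk ...`.

```shell
# Mean Centiloid per visit: column 2 is Visit, column 4 is Centiloid.
# Inline toy sample here; point it at facehbi_data.csv in practice.
tail -n +2 <<'EOF' | awk -F, '{s[$2]+=$4; n[$2]++} END{for(v in s) printf "visit %s: mean CL = %.2f\n", v, s[v]/n[v]}' | sort
Subject,Visit,Date,Centiloid
F001,0,2014-12-11,-4.0
F001,2,2017-01-26,-2.0
F002,0,2014-12-11,12.0
F002,2,2017-04-20,8.0
EOF
```

On this toy sample it prints a mean of 4.00 for visit 0 and 3.00 for visit 2.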
We can write a script to extract the values and the deltas at each time point,
#!/usr/bin/perl
#
# Merge the per-visit Centiloid files and print, for each subject, the
# CL value and scan date at each visit plus the annualized CL change
# between visits (D02: v0->v2, D05: v0->v5, D25: v2->v5).
use strict;
use warnings;
use Date::Manip;
use Math::Round;

my @ifiles = ('facehbi_v0_cl.csv', 'facehbi_v2_cl.csv', 'facehbi_v5_cl.csv');
my %ncls = ( $ifiles[0] => 'v0CL', $ifiles[1] => 'v2CL', $ifiles[2] => 'v5CL');
my %pdates = ( $ifiles[0] => 'v0Date', $ifiles[1] => 'v2Date', $ifiles[2] => 'v5Date');
my %sdata;
for my $ifile (sort @ifiles){
    open IDF, "<$ifile" or die "No such file: $ifile\n";
    while(<IDF>){
        # Lines look like: Subject;Date;SUVR;Centiloid -> keep subject, date, CL
        my ($sbj, $date, $cl) = /(F.*);(.*);.*;(.*)$/;
        $sdata{$sbj}{$ncls{$ifile}} = $cl if defined $sbj;
        $sdata{$sbj}{$pdates{$ifile}} = $date if defined $sbj;
    }
    close IDF;
}
print "Subject";
foreach my $vcl (sort keys %ncls){
    print ",$ncls{$vcl}";
    print ",$pdates{$vcl}";
}
print ",D02,D05,D25\n";
foreach my $sbj (sort keys %sdata){
    print "$sbj";
    foreach my $vcl (sort keys %ncls){
        if(exists($sdata{$sbj}{$ncls{$vcl}})){
            print ",$sdata{$sbj}{$ncls{$vcl}}";
        }else{
            print ",NA";
        }
        if(exists($sdata{$sbj}{$pdates{$vcl}})){
            print ",$sdata{$sbj}{$pdates{$vcl}}";
        }else{
            print ",NA";
        }
    }
    # D02: annualized change between v0 and v2
    if(exists($sdata{$sbj}{$ncls{$ifiles[0]}}) and exists($sdata{$sbj}{$ncls{$ifiles[1]}})){
        my $dif = $sdata{$sbj}{$ncls{$ifiles[1]}} - $sdata{$sbj}{$ncls{$ifiles[0]}};
        my $d1 = ParseDate($sdata{$sbj}{$pdates{$ifiles[0]}});
        my $d2 = ParseDate($sdata{$sbj}{$pdates{$ifiles[1]}});
        # hours between scans -> years
        my $ddif = Delta_Format(DateCalc($d1,$d2),2,"%hh")/(24*365.2425);
        my $res = nearest(0.01, $dif/$ddif);
        print ",$res";
    }else{
        print ",NA";
    }
    # D05: annualized change between v0 and v5
    if(exists($sdata{$sbj}{$ncls{$ifiles[0]}}) and exists($sdata{$sbj}{$ncls{$ifiles[2]}})){
        my $dif = $sdata{$sbj}{$ncls{$ifiles[2]}} - $sdata{$sbj}{$ncls{$ifiles[0]}};
        my $d1 = ParseDate($sdata{$sbj}{$pdates{$ifiles[0]}});
        my $d2 = ParseDate($sdata{$sbj}{$pdates{$ifiles[2]}});
        my $ddif = Delta_Format(DateCalc($d1,$d2),2,"%hh")/(24*365.2425);
        my $res = nearest(0.01, $dif/$ddif);
        print ",$res";
    }else{
        print ",NA";
    }
    # D25: annualized change between v2 and v5
    if(exists($sdata{$sbj}{$ncls{$ifiles[1]}}) and exists($sdata{$sbj}{$ncls{$ifiles[2]}})){
        my $dif = $sdata{$sbj}{$ncls{$ifiles[2]}} - $sdata{$sbj}{$ncls{$ifiles[1]}};
        my $d1 = ParseDate($sdata{$sbj}{$pdates{$ifiles[1]}});
        my $d2 = ParseDate($sdata{$sbj}{$pdates{$ifiles[2]}});
        my $ddif = Delta_Format(DateCalc($d1,$d2),2,"%hh")/(24*365.2425);
        my $res = nearest(0.01, $dif/$ddif);
        print ",$res";
    }else{
        print ",NA";
    }
    print "\n";
}
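The annualization in the script is just hours between scans divided by 24*365.2425. The same denominator can be checked from the shell (assuming GNU date is available); here with F001's v0 and v2 scan dates from the table above:

```shell
# Years between two scan dates, same convention as the Perl script:
# seconds / (3600 * 24 * 365.2425). Assumes GNU date (-d, -u).
d1=$(date -u -d 2014-12-11 +%s)
d2=$(date -u -d 2017-01-26 +%s)
years=$(awk -v a="$d1" -v b="$d2" 'BEGIN{printf "%.4f", (b-a)/(3600*24*365.2425)}')
echo "$years"   # about 2.13 years between F001's v0 and v2
```

This checks out against the table: F001's D02 of 1.53 CL/yr is (-1.61 - (-4.86)) / 2.13.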
At this point it is hard to tell accumulators from non-accumulators, since the amyloid accumulation rate looks much the same across subjects.
Latent class mixed-effect model
If we assume that the Centiloid change depends only on the amount already present and on a predisposition to accumulate it, let's try to see how many subjects are prone to faster accumulation. Or something like that.
So I'll compute the mean value between each pair of time points and the slope over time.
Here goes,
#!/usr/bin/perl
#
# Read the longitudinal table (facehbi_data.csv) and, for each pair of
# consecutive visits, compute the mean CL (CMean) and the annualized
# CL change (DCl).
use strict;
use warnings;
use Date::Manip;
use Math::Round;
use Data::Dump qw(dump);

my %ldata;
my $ifile = 'facehbi_data.csv';
open IDF, "<$ifile" or die "No such file\n";
while (<IDF>){
    if(/^F/){
        my ($subject,$visit,$date,$cl) = /(.*),(.*),(.*),(.*)/;
        $ldata{$subject}{$visit}{'Date'} = $date;
        $ldata{$subject}{$visit}{'CL'} = $cl;
    }
}
close IDF;

foreach my $subject (sort keys %ldata){
    # v0 -> v2
    if (exists($ldata{$subject}{'0'}) and exists($ldata{$subject}{'0'}{Date})) {
        if (exists($ldata{$subject}{'2'}) and exists($ldata{$subject}{'2'}{Date})){
            my $dif = $ldata{$subject}{'2'}{'CL'} - $ldata{$subject}{'0'}{'CL'};
            my $mean = 0.5*($ldata{$subject}{'2'}{'CL'} + $ldata{$subject}{'0'}{'CL'});
            my $d1 = ParseDate($ldata{$subject}{'0'}{'Date'});
            my $d2 = ParseDate($ldata{$subject}{'2'}{'Date'});
            # hours between scans -> years
            my $ddif = Delta_Format(DateCalc($d1,$d2),2,"%hh")/(24*365.2425);
            $ldata{$subject}{'2'}{'DCl'} = nearest(0.01, $dif/$ddif);
            $ldata{$subject}{'2'}{'CMean'} = $mean;
        }else{
            $ldata{$subject}{'2'}{'DCl'} = 'NA';
            $ldata{$subject}{'2'}{'CMean'} = 'NA';
        }
    }else{
        $ldata{$subject}{'2'}{'DCl'} = 'NA';
        $ldata{$subject}{'2'}{'CMean'} = 'NA';
    }
    # v2 -> v5
    if (exists($ldata{$subject}{'2'}) and exists($ldata{$subject}{'2'}{Date})) {
        if (exists($ldata{$subject}{'5'}) and exists($ldata{$subject}{'5'}{Date})){
            my $dif = $ldata{$subject}{'5'}{'CL'} - $ldata{$subject}{'2'}{'CL'};
            my $mean = 0.5*($ldata{$subject}{'5'}{'CL'} + $ldata{$subject}{'2'}{'CL'});
            my $d1 = ParseDate($ldata{$subject}{'2'}{'Date'});
            my $d2 = ParseDate($ldata{$subject}{'5'}{'Date'});
            my $ddif = Delta_Format(DateCalc($d1,$d2),2,"%hh")/(24*365.2425);
            $ldata{$subject}{'5'}{'DCl'} = nearest(0.01, $dif/$ddif);
            $ldata{$subject}{'5'}{'CMean'} = $mean;
        }else{
            $ldata{$subject}{'5'}{'DCl'} = 'NA';
            $ldata{$subject}{'5'}{'CMean'} = 'NA';
        }
    }else{
        $ldata{$subject}{'5'}{'DCl'} = 'NA';
        $ldata{$subject}{'5'}{'CMean'} = 'NA';
    }
    # v0 has no previous point
    $ldata{$subject}{'0'}{'DCl'} = 'NA';
    $ldata{$subject}{'0'}{'CMean'} = 'NA';
}

my $ofile = 'facehbi_data_deltas.csv';
open ODF, ">$ofile" or die "Could not create output file\n";
print ODF "Subject,Visit,Date,Centiloid,CMean,Delta\n";
foreach my $subject (sort keys %ldata){
    foreach my $visit (sort keys %{$ldata{$subject}}){
        if(exists($ldata{$subject}{$visit}) and exists($ldata{$subject}{$visit}{'Date'})){
            print ODF "$subject,$visit,$ldata{$subject}{$visit}{'Date'},$ldata{$subject}{$visit}{'CL'},$ldata{$subject}{$visit}{'CMean'},$ldata{$subject}{$visit}{'DCl'}\n";
        }
    }
}
close ODF;
And this is what I'll feed into R, but first I strip the leading 'F' so the subject IDs are numeric (lcmm expects a numeric subject variable),
[osotolongo@brick03 lmodel_facehbi]$ sed 's/^F//' facehbi_data_deltas.csv > facehbi_data_deltasn.csv
and then,
> dd <- read.csv('facehbi_data_deltasn.csv')
> m0 <- lcmm(Delta ~ CMean, random = ~CMean, subject = 'Subject', mixture=~CMean, ng=2, link='linear', data=dd)
> summary(m0)
General latent class mixed model
fitted by maximum likelihood method
lcmm(fixed = Delta ~ CMean, mixture = ~CMean, random = ~CMean,
subject = "Subject", ng = 2, link = "linear", data = dd)
Statistical Model:
Dataset: dd
Number of subjects: 196
Number of observations: 314
Number of observations deleted: 231
Number of latent classes: 2
Number of parameters: 9
Link function: linear
Iteration process:
Convergence criteria satisfied
Number of iterations: 24
Convergence criteria: parameters= 5.1e-06
: likelihood= 5.2e-06
: second derivatives= 2.5e-10
Goodness-of-fit statistics:
maximum log-likelihood: -756.23
AIC: 1530.45
BIC: 1559.96
Maximum Likelihood Estimates:
Fixed effects in the class-membership model:
(the class of reference is the last class)
coef Se Wald p-value
intercept class1 -1.37467 0.71858 -1.913 0.05574
Fixed effects in the longitudinal model:
coef Se Wald p-value
intercept class1 (not estimated) 0
intercept class2 -0.11028 0.22263 -0.495 0.62035
CMean class1 0.09217 0.01507 6.118 0.00000
CMean class2 0.03010 0.00528 5.701 0.00000
Variance-covariance matrix of the random-effects:
intercept CMean
intercept 0.00134
CMean 0.00043 0.00014
Residual standard error (not estimated) = 1
Parameters of the link function:
coef Se Wald p-value
Linear 1 (intercept) 1.17087 0.46438 2.521 0.01169
Linear 2 (std err) 2.46514 0.10877 22.664 0.00000
> postprob(m0)
Posterior classification:
class1 class2
N 8.00 188.00
% 4.08 95.92
Posterior classification table:
--> mean of posterior probabilities in each class
prob1 prob2
class1 0.8008 0.1992
class2 0.1764 0.8236
Posterior probabilities above a threshold (%):
class1 class2
prob>0.7 87.5 92.55
prob>0.8 37.5 55.32
prob>0.9 25.0 20.21
> dd$Subject <- as.character(dd$Subject)
> people <- as.data.frame(m0$pprob[,1:2])
> dd$group <- factor(people$class[unlist(sapply(dd$Subject, function(x) which(people$Subject==x)))])
> plot(dd$CMean, dd$Delta, main="Centiloid variation", xlab="Mean Centiloid", ylab="Slope", pch=15, col=ifelse(dd$group==1,"red","blue"))
I can look at how the classes map onto the Centiloid evolution,
library("lcmm")
dd <- read.csv('facehbi_data_deltasn.csv')
m0 <- lcmm(Delta ~ CMean, random = ~CMean, subject = 'Subject', mixture=~CMean, ng=2, link='linear', data=dd)
dd$Subject <- as.character(dd$Subject)
people <- as.data.frame(m0$pprob[,1:2])
dd$group <- factor(people$class[unlist(sapply(dd$Subject, function(x) which(people$Subject==x)))])
postscript("classes_evolution.ps", width=1024, height=600, bg="white")
plot(dd$CMean, dd$Delta, main="Centiloid variation", xlab="Mean Centiloid", ylab="Slope", pch=15, col=ifelse(dd$group==1,"red","blue"))
d1<- subset(dd, group==1)
d2<- subset(dd, group==2)
x <- c(-50:200)
a1 <- lm(d1$Delta ~ poly(d1$CMean, 3, raw=T))
y1 <- a1$coefficients[[1]] + a1$coefficients[[2]] * x + a1$coefficients[[3]] * x^2 + a1$coefficients[[4]]* x^3
points(x, y1, type="l", col="red", lwd=3)
a2 <- lm(d2$Delta ~ poly(d2$CMean, 3, raw=T))
y2 <- a2$coefficients[[1]] + a2$coefficients[[2]] * x + a2$coefficients[[3]] * x^2 + a2$coefficients[[4]]* x^3
points(x, y2, type="l", col="blue", lwd=3)
dev.off()
How to extract the classes,
write.csv(people, "classification.csv", row.names=F)
and now,
[osotolongo@brick03 lmodel_facehbi]$ head classification.csv
"Subject","class"
1,2
2,2
3,2
5,2
6,2
7,2
8,2
9,1
10,2
[osotolongo@brick03 lmodel_facehbi]$ (head -n 1 classification.csv | sed 's/"//g' && tail -n +2 classification.csv | awk -F',' '{l = sprintf("F%03d,%d", $1, $2); print l}') > class_proper.csv
[osotolongo@brick03 lmodel_facehbi]$ head class_proper.csv
Subject,class
F001,2
F002,2
F003,2
F005,2
F006,2
F007,2
F008,2
F009,1
F010,2
This can now be joined with the other files,
[osotolongo@brick03 lmodel_facehbi]$ join -t, facehbi_data.csv class_proper.csv > facehbi_data_classes.csv
[osotolongo@brick03 lmodel_facehbi]$ join -t, facehbi_data_deltas.csv class_proper.csv > facehbi_deltas_classes.csv
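The join step relies on both files being comma-delimited and sorted on the subject ID. A toy illustration of what it does (made-up values, hypothetical temp files):

```shell
# join -t, merges rows that share the same first field; both inputs
# must already be sorted on it. Toy data, not the FACEHBI files.
cat > /tmp/cl_demo_data.csv <<'EOF'
F001,0,10.5
F002,0,12.1
EOF
cat > /tmp/cl_demo_classes.csv <<'EOF'
F001,2
F002,1
EOF
join -t, /tmp/cl_demo_data.csv /tmp/cl_demo_classes.csv
# F001,0,10.5,2
# F002,0,12.1,1
EOF_MARKER_NOT_NEEDED
```

Note that by default join silently drops subjects missing from either file, so unmatched rows simply disappear from the merged CSV.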
Now I can put all this back into the initial picture.
K-means clustering (kml)
I'm going to try something else, because this leaves me with very few subjects classified as accumulators.
First, I don't need the 0-to-5 delta; I'm not going to do anything with it.
So I'll edit //ajoin.pl//,
#!/usr/bin/perl
#
use strict;
use warnings;
use Date::Manip;
use Math::Round;

my @ifiles = ('facehbi_v0_cl.csv', 'facehbi_v2_cl.csv', 'facehbi_v5_cl.csv');
my %ncls = ( $ifiles[0] => 'v0CL', $ifiles[1] => 'v2CL', $ifiles[2] => 'v5CL');
my %pdates = ( $ifiles[0] => 'v0Date', $ifiles[1] => 'v2Date', $ifiles[2] => 'v5Date');
my %sdata;
for my $ifile (sort @ifiles){
    open IDF, "<$ifile" or die "No such file: $ifile\n";
    while(<IDF>){
        my ($sbj,$date,$cl) = /(F.*);(.*);.*;(.*)$/;
        $sdata{$sbj}{$ncls{$ifile}} = $cl if defined $sbj;
        $sdata{$sbj}{$pdates{$ifile}} = $date if defined $sbj;
    }
    close IDF;
}
print "Subject";
foreach my $vcl (sort keys %ncls){
    print ",$ncls{$vcl}";
    print ",$pdates{$vcl}";
}
#print ",D02,D05,D25\n";
print ",D02,D25\n";
foreach my $sbj (sort keys %sdata){
    print "$sbj";
    foreach my $vcl (sort keys %ncls){
        if(exists($sdata{$sbj}{$ncls{$vcl}})){
            print ",$sdata{$sbj}{$ncls{$vcl}}";
        }else{
            print ",NA";
        }
        if(exists($sdata{$sbj}{$pdates{$vcl}})){
            print ",$sdata{$sbj}{$pdates{$vcl}}";
        }else{
            print ",NA";
        }
    }
    if(exists($sdata{$sbj}{$ncls{$ifiles[0]}}) and exists($sdata{$sbj}{$ncls{$ifiles[1]}})){
        my $dif = $sdata{$sbj}{$ncls{$ifiles[1]}} - $sdata{$sbj}{$ncls{$ifiles[0]}};
        my $d1 = ParseDate($sdata{$sbj}{$pdates{$ifiles[0]}});
        my $d2 = ParseDate($sdata{$sbj}{$pdates{$ifiles[1]}});
        my $ddif = Delta_Format(DateCalc($d1,$d2),2,"%hh")/(24*365.2425);
        my $res = nearest(0.01, $dif/$ddif);
        print ",$res";
    }else{
        print ",NA";
    }
    # D05 (v0 -> v5) dropped:
    # if(exists($sdata{$sbj}{$ncls{$ifiles[0]}}) and exists($sdata{$sbj}{$ncls{$ifiles[2]}})){
    #     my $dif = $sdata{$sbj}{$ncls{$ifiles[2]}} - $sdata{$sbj}{$ncls{$ifiles[0]}};
    #     my $d1 = ParseDate($sdata{$sbj}{$pdates{$ifiles[0]}});
    #     my $d2 = ParseDate($sdata{$sbj}{$pdates{$ifiles[2]}});
    #     my $ddif = Delta_Format(DateCalc($d1,$d2),2,"%hh")/(24*365.2425);
    #     my $res = nearest(0.01, $dif/$ddif);
    #     print ",$res";
    # }else{
    #     print ",NA";
    # }
    if(exists($sdata{$sbj}{$ncls{$ifiles[1]}}) and exists($sdata{$sbj}{$ncls{$ifiles[2]}})){
        my $dif = $sdata{$sbj}{$ncls{$ifiles[2]}} - $sdata{$sbj}{$ncls{$ifiles[1]}};
        my $d1 = ParseDate($sdata{$sbj}{$pdates{$ifiles[1]}});
        my $d2 = ParseDate($sdata{$sbj}{$pdates{$ifiles[2]}});
        my $ddif = Delta_Format(DateCalc($d1,$d2),2,"%hh")/(24*365.2425);
        my $res = nearest(0.01, $dif/$ddif);
        print ",$res";
    }else{
        print ",NA";
    }
    print "\n";
}
and then I get a data file like,
[osotolongo@brick03 lmodel_facehbi]$ head full_cl.csv
Subject,v0CL,v0Date,v2CL,v2Date,v5CL,v5Date,D02,D25
F001,-4.8572789076631135,2014-12-11,-1.6070096780176448,2017-01-26,5.995931423082964,2020-02-06,1.53,2.51
F002,11.328231785153815,2014-12-11,8.026013832802747,2017-04-20,NA,NA,-1.4,NA
F003,-5.678781553555588,2014-12-18,-6.6177593835407915,2017-01-26,NA,NA,-0.45,NA
F004,11.271685160898205,2014-12-18,NA,NA,NA,NA,NA,NA
F005,-6.979040300640364,2015-01-22,-11.232883025554106,2017-02-02,-1.8057342300172725,2020-02-06,-2.09,3.13
F006,37.656101185307456,2015-01-15,41.15717114649641,2017-01-26,NA,NA,1.72,NA
F007,77.10440832054385,2015-01-15,81.56812711267803,2017-01-26,108.1401345902,2019-12-19,2.2,9.18
F008,-16.304844472088252,2015-01-15,-19.94720239463048,2017-02-02,-4.065012783479986,2020-11-12,-1.78,4.21
F009,4.344542413279139,2015-01-29,6.640359126157083,2017-02-16,31.94283503876882,2020-02-13,1.12,8.46
Now I can do,
library("kml")
read.csv("full_cl.csv") -> ct
kml::cld(ct, timeInData = 8:9, maxNA = 1) -> kct
kml::kml(kct, nbClusters = 2:3, nbRedrawing = 5)
getClusters(kct, 2, clusterRank = 1, asInteger = T) -> ct$k2c
getClusters(kct, 3, clusterRank = 1, asInteger = T) -> ct$k3c
write.csv(ct, "full_vclusters.csv", row.names=F)
Note: here the longitudinal variable is not the Centiloid value itself but the Delta, so the clustering has to be done on the latter.
I pulled the clusters for both 2 and 3 groups, but it clearly splits into just 2 clusters. Let's see it more clearly.
[osotolongo@brick03 lmodel_facehbi]$ awk -F"," '{print $1","$10","$11}' full_vclusters.csv > facehbi_kmlvclusters.csv
[osotolongo@brick03 lmodel_facehbi]$ join -t, facehbi_data.csv facehbi_kmlvclusters.csv > facehbi_data_vclusters.csv
[osotolongo@brick03 lmodel_facehbi]$ join -t, facehbi_data_deltas.csv facehbi_kmlvclusters.csv > facehbi_deltas_vclusters.csv
Now for the plots,
setwd("/nas/osotolongo/lmodel_facehbi")
read.csv("facehbi_deltas_vclusters.csv") -> dv
postscript("centiloid_delta_clusters.ps", width=1024, height=600, bg="white")
plot(dv$CMean, dv$Delta, main="Centiloid variation", xlab="Mean Centiloid", ylab="Slope", pch=15, col=factor(dv$k2c))
dev.off()
postscript("centiloid_delta_clusters_evolution.ps", width=1024, height=600, bg="white")
plot(dv$CMean, dv$Delta, main="Centiloid variation", xlab="Mean Centiloid", ylab="Slope", pch=15, col=ifelse(dv$k2c==1,"blue","red"))
x <- c(-50:200)
d1<- subset(dv, k2c==1)
d2<- subset(dv, k2c==2)
a1 <- lm(d1$Delta ~ poly(d1$CMean, 3, raw=T))
y1 <- a1$coefficients[[1]] + a1$coefficients[[2]] * x + a1$coefficients[[3]] * x^2 + a1$coefficients[[4]]* x^3
points(x, y1, type="l", col="blue", lwd=3)
a2 <- lm(d2$Delta ~ poly(d2$CMean, 3, raw=T))
y2 <- a2$coefficients[[1]] + a2$coefficients[[2]] * x + a2$coefficients[[3]] * x^2 + a2$coefficients[[4]]* x^3
points(x, y2, type="l", col="red", lwd=3)
dev.off()
fdv <- read.csv("facehbi_data_vclusters.csv")
library(ggplot2)
postscript("centiloid_clusters_by_visit.ps", width=1024, height=600, bg="white")
ggplot(fdv, aes(x = Visit, y = Centiloid, color = factor(1-k2c), group = Subject)) + geom_point() + geom_line() + theme(legend.position = "none")
dev.off()