September 27, 2012

Continuous dispersal on a discrete lattice

By Corey Chivers ¶ Posted in Probability, Rstats ¶ Tagged continuous, discrete, discrete space, dispersal, dispersal kernel, ecology, lattice, probability distribution, spatial, spatial spread, uniform random number ¶ 12 Comments

*Edit: I made a video!

Dispersal is a key process in many domains, and particularly in ecology. Individuals move in space, and this movement can be modelled as a random process following some kernel. The dispersal kernel is simply a probability distribution describing the distance travelled in a given time frame. Since space is continuous, it is natural to use a continuous kernel. However, some modelling frameworks are formulated on a lattice, or discrete array of patches.

So how can we implement a continuous kernel in discrete space?

As with many modelling situations, we could approach this in a number of ways. Here is the implementation that I can up with, and I welcome your suggestions, dear reader, for alternatives or improvements to this approach.

Draw a random variate d from the dispersal kernel.
Draw a uniform random number θ between 0 and 2π, which we will use to choose a direction.
Calculate the relative location (in continuous space) of the dispersed individuals using some trig:
x = cos(θ) d
y = sin(θ) d
Determine the new location on the lattice for each individual by adding the relative x and y positions to the current location. Round these locations to the nearest integer and take the modulo of this integer and the lattice size. This creates a torus out of the lattice such that the outer edges loop back on each other. If you remember the old Donkey Kong games, you can think of this like how when you leave out the right side of the screen you enter from the left.

I implemented this approach in R as a function that takes in a population in a lattice, and returns a lattice with the dispersed population. The user can also specify which dispersal kernel to use. Here is the result using a negative-exponential kernel on a 50×50 lattice.

Created by iterating over the dispersal function:

## General function to take in a lattice and disperse
## according to a user provided dispersal kernel
## Author: Corey Chivers
lat_disp<-function(pop,kernel,...)
{
lattice_size<-dim(pop)
new_pop<-array(0,dim=lattice_size)
for(i in 1:lattice_size[1])
{
for(j in 1:lattice_size[2])
{
N<-pop[i,j]
dist<-kernel(N,...)
theta<-runif(N,0,2*pi)
x<-cos(theta)*dist
y<-sin(theta)*dist

for(k in 1:N)
{
x_ind<-(round(i+x[k])-1) %% lattice_size[1] + 1
y_ind<-(round(j+y[k])-1) %% lattice_size[2] + 1
new_pop[x_ind,y_ind]<-new_pop[x_ind,y_ind]+1
}
}
}
return(new_pop)
}

For comparison, I also ran the same population using a Gaussian kernel. I defined the parameters of both kernels to have a mean dispersal distance of 1.

Here is the result using a Gaussian kernel:

The resulting population after 35 time steps has a smaller range than when using the exponential kernel, highlighting the importance of the shape of the dispersal kernel for spreading populations (remember that in both cases the average dispersal distance is the same).

Code for generating the plots:

############## Run and plot #######################

## Custom colour ramp
colours<-c('grey','blue','black')
cus_col<-colorRampPalette(colors=colours, space = c("rgb", "Lab"),interpolate = c("linear", "spline"))

## Initialize population array
Time=35
pop<-array(0,dim=c(Time,50,50))
pop[1,25,25]<-10000

### Normal Kernel ###
par(mfrow=c(1,1))
for(i in 2:Time)
{
image(pop[i-1,,],col=cus_col(100),xaxt='n',yaxt='n')
pop[i,,]<-lat_disp(pop[i-1,,],kernel=rnorm,mean=0,sd=1)
}

## Plot
png('normal_kern.png', width = 800, height = 800)
par(mfrow=c(2,2),pty='s',omi=c(0.1,0.1,0.5,0.1),mar=c(2,0,2,0))
times<-c(5,15,25,35)
for(i in times)
image(pop[i-1,,],
col=cus_col(100),
xaxt='n',
yaxt='n',
useRaster=TRUE,
main=paste("Time =",i))

mtext("Gaussian Kernel",outer=TRUE,cex=1.5)
dev.off()

### Exponential Kernel ###
par(mfrow=c(1,1))
for(i in 2:Time)
{
image(pop[i-1,,],col=cus_col(100),xaxt='n',yaxt='n')
pop[i,,]<-lat_disp(pop[i-1,,],kernel=rexp,rate=1)
}

## Plot
png('exp_kern.png',  width = 800, height = 800)
par(mfrow=c(2,2),pty='s',omi=c(0.1,0.1,0.5,0.1),mar=c(2,0,2,0))
times<-c(5,15,25,35)
for(i in times)
image(pop[i-1,,],
col=cus_col(100),
xaxt='n',
yaxt='n',
useRaster=TRUE,
main=paste("Time =",i))

mtext("Exponential Kernel",outer=TRUE,cex=1.5)
dev.off()

############################################################

September 14, 2012

Mapping Bike Accidents in R

By Corey Chivers ¶ Posted in Rstats ¶ Tagged accidents, animation, cycling, data, map, Montreal, plotting, transportation, visualization ¶ 3 Comments

At last weekend’s Hack Ta Ville event here in Montreal, I joined up with some talented urban planners and web devs to realize Vélobstacles. The idea of the project is to crowd source information on cycling conditions around the city. As with any crowd sourcing project, we were faced with the problem of seeding the site with some data to draw the attention of users to get the ball rolling.

Fortunately, we had access to a data set of all reported cycling accidents between 2006-2010. Once we seeded Vélobstacles with this data, the web devs went to town adding features to the site, and I had outlived my usefulness as a data geek. So I decided to play with the accident data a little and produce some visualization. I plotted all the accidents on a map and animated it through time. I also calculated and plotted the monthly accident rate using a moving average.

Be sure to select HD quality:

Not surprisingly, the accident rate goes way up in the summer months as Montreal winters are braved on two wheels by only a rarefied few. What is interesting is the mid-summer dip in the accident rate. This dip is notably correlated with Montreal’s much beloved construction holiday – though the causal relationship is unclear. If you have any alternative explanations, or an idea about how to test the construction holiday hypothesis, drop a note in the comments.

As always, you can get the code on my github page.

September 6, 2012

The future of Artificial Intelligence – as imagined in 1989

By Corey Chivers ¶ Posted in Probability, Rstats ¶ Tagged AI, Artificial Intelligence, futurism, illustration, metaphor, plotting, robot hand, robots, statistics, visual representations ¶ 1 Comment

This image comes from the cover of Preliminary Papers of the Second International Workshop on Artificial Intelligence and Statistics (1989). Someone abandoned it in the lobby of my building at school. Whatever for, I’ll never know.

I just love the idea of machine learning/AI/Statistics evoking a robot hand drawing a best fit line through some points on graph paper with a pen.

What other funny, or interesting, metaphorical visual representations of machine learning have you seen? Drop a link in the comments and I’ll get my robot arm to compile the results.

August 26, 2012

By Corey Chivers ¶ Posted in Uncategorized ¶ Leave a comment

~~The horror of left over variance…~~

After some discussion with Jeremy Fox, and a re-reading of Fox (2006), my understanding of the Price Equation partitioning of effects of biodiversity loss on ecosystem function has been improved. What I thought was a term for residual variance is the context dependent effect. This is analogous to transmission bias in the evolutionary interpretation of the Price Equation.

August 26, 2012

Walmart Invasion

By Corey Chivers ¶ Posted in Rstats ¶ Tagged diffusion, dispersal, epidemic, growth, invasive species, rstats, spread, visualization, walmart ¶ 16 Comments

As an invasion biologist, the process of spatial spread is at the heart of what I do. When I came across this dataset of Walmart store openings since 1962 I couldn’t help but see it as an invasion front which looks a lot like a biological invasion or (albeit slow) epidemic.

The video shows monthly store openings (in red) between 1962 and 2006.

You can get the code I wrote for generating this visualization on my github page.

August 17, 2012

An update on visualizing Bayesian updating

By Corey Chivers ¶ Posted in Probability, Rstats ¶ Tagged bayesian updating, Bernoulli, bernoulli trials, beta distribution, priors, visualizatin ¶ 8 Comments

A while ago I wrote this post with some R code to visualize the updating of a beta distribution as the outcome of Bernoulli trials are observed. The code provided a single plot of this process, with all the curves overlayed on top of one another. Then John Myles White (co-author of Machine Learning for Hackers) piped up on twitter and said that he’d like to see it as an animation. Challenge accepted – and with an additional twist.

The video shows how two observers who approach the problem with different beliefs (priors) converge toward the same conclusion about the value of the unknown parameter after making enough observations.

I’ve posted the code here.

August 13, 2012

The essence of a handwritten digit

By Corey Chivers ¶ Posted in Rstats ¶ Tagged kaggle, machine learning, optical character recognition, visualization ¶ 3 Comments

If you haven’t yet discovered the competitive machine learning site kaggle.com, please do so now. I’ll wait.

Great – so, you checked it out, fell in love and have made it back. I recently downloaded the data for the getting started competition. It consists of 42000 labelled images (28×28) of hand written digits 0-9. The competition is a straight forward supervised learning problem of OCR (Optical Character Recognition). There are two sample R scripts on the site to get you started. They implement the k-nearest neighbours and Random Forest algorithms.

I wanted to get started by visualizing all of the training data by rendering some sort of an average of each character. Visualizing the data is a great first step to developing a model. Here’s how I did it:

## Read in data
train <- read.csv("../data/train.csv", header=TRUE)
train<-as.matrix(train)

##Color ramp def.
colors<-c('white','black')
cus_col<-colorRampPalette(colors=colors)

## Plot the average image of each digit
par(mfrow=c(4,3),pty='s',mar=c(1,1,1,1),xaxt='n',yaxt='n')
all_img<-array(dim=c(10,28*28))
for(di in 0:9)
{
print(di)
all_img[di+1,]<-apply(train[train[,1]==di,-1],2,sum)
all_img[di+1,]<-all_img[di+1,]/max(all_img[di+1,])*255

z<-array(all_img[di+1,],dim=c(28,28))
z<-z[,28:1] ##right side up
image(1:28,1:28,z,main=di,col=cus_col(256))
}

Which gives you:
Notice the wobbly looking ‘1’. You can see that there is some variance in the angle of the slant, with a tenancy toward leaning right. I imagine that this is due to the bias toward right handed individuals in the sample.

I also wanted to generate a pdf plot of all of the training set, to get myself an idea of what kind of anomalous instances I should expect.

If you are interested, dear reader, here is my code to do just that.

pdf('train_letters.pdf')
par(mfrow=c(4,4),pty='s',mar=c(3,3,3,3),xaxt='n',yaxt='n')
for(i in 1:nrow(train))
{
z<-array(train[i,-1],dim=c(28,28))
z<-z[,28:1] ##right side up
image(1:28,1:28,z,main=train[i,1],col=cus_col(256))
print(i)
}
dev.off()

Which will give you a 2625 page pdf of every character in the training set which you can, um, casually peruse.
As of the time of writing, the current leading submission has a classification accuracy of 99.27%. There is no cash for this competition, but the knowledge gained from taking a stab at it is priceless. So give it a shot!

August 10, 2012

Simulation: The modeller’s laboratory

By Corey Chivers ¶ Posted in Probability, Rstats ¶ Tagged experiment, gravity model, invasive species, model selection, random utility model, simulation, spread ¶ 6 Comments

In his 2004 paper in Trends in Ecology and Evolution, Steven Peck argues:

Simulation models can be used to mimic complex systems, but unlike nature, can be manipulated in ways that would be impossible, too costly or unethical to do in natural systems. Simulation can add to theory development and testing, can offer hypotheses about the way the world works and can give guidance as to which data are most important to gather experimentally.

A sentiment I agree with fully. However, another important use of simulation is in the experimentation phase of model development before confronting models with data. In ecology, epidemiology and related fields, it is common to have observational data (as opposed to controlled, randomized experiments). In these situations, two questions need to be asked:

What can we observe about the system? (ie what will the data look like?)
Given what we can observe (1), will our model(s) be able to capture the underlying process and parameters?

In order to answer these questions, we need to be able to simulate the hypothesized processes and use the simulated observations to fit our model(s). This process has the additional benefit of forcing us to understand the process that we are modelling. In fact, I find that the act of formalizing the hypothesized process into a coded simulation makes the formulation of the likelihood function more straightforward.

Let’s look at an example. In a recent paper in the Journal of Applied Ecology, we model the trip taking behaviour of recreational boaters in Ontario. Through a survey, we observed the trip outcomes of a sample of boaters. We wanted to compare two models of the trip taking process (see the paper for model descriptions).


###############################
##### RUM v GM sim tests ######
##
## Corey Chivers, 2012
##
###############################

n_boaters<-50
n_lakes<-750
n_trips<-rpois(n_boaters,16)

euclidian<-function(from_x,from_y,to_x,to_y)
{
return( sqrt( (from_x-to_x)^2 + (from_y-to_y)^2 ) )
}
sim_boaters<-function(N=n_boaters)
{
##Home locations
return(as.matrix(cbind(runif(N,0,100),runif(N,0,100)) ))
}
sim_lakes<-function(N=n_lakes)
{
x<-cbind(runif(N,0,100))
y<-cbind(runif(N,0,100))
size<-abs(rnorm(N,x+y,x+y))
return(cbind(x,y,size))
}
get_d_mat<-function()
{
## matrix of boaterxlake distances
d_mat<-array(dim=c(n_boaters,n_lakes))
for(b in 1:n_boaters)
{
d_mat[b,]<-euclidian( boater[b,1],boater[b,2],lakes[,1],lakes[,2] )
}
return(d_mat)
}
sim_trips<-function(par,M)
{
trips<-list()
for(b in 1:n_boaters)
{
P<-M(par,b)
trips[[b]]<-sample(1:n_lakes,n_trips[b],p=P,replace=TRUE)
}
return(trips)
}

RUM<-function(par,b) ## par{b1,b2}
{
exp_sum_V<-sum(exp(par[1]*lakes[,3]+par[2]*d_mat[b,]) )
V<-exp(par[1]*lakes[,3]+par[2]*d_mat[b,])
p_j<-V/exp_sum_V
return(p_j)
}
GM<-function(par,b) ## par{e,d}
{
Ai<-sum(lakes[,3]^(par[1]) * d_mat[b,]^(-par[2]))
WD<-lakes[,3]^(par[1]) * d_mat[b,]^(-par[2])
p_j<-WD/Ai
return(p_j)
}
ll<-function(par,M,data,give_neg=-1)
{
ll<-0
ll<-sapply(1:n_boaters,function(b){
P_b<-M(par,b)
l_tmp<-0
for(t in 1:length(data[[b]]))
l_tmp<-l_tmp+log(P_b[ data[[b]][t] ] )
return(l_tmp)
})
return(give_neg*sum(ll))
}
dAIC<-function(ll)
{
delta_aic<-numeric(length(ll[,1]))
for(i in 1:length(ll[,1]))
{
delta_aic[i]<-(2*ll[i,2]+(2*n_par))-(2*ll[i,1]+(2*n_par))
}
return(delta_aic)
}
plot_trips<-function(col='green',add_trips=TRUE,pdffile=NULL)
{
if(!is.null(pdffile))
pdf(pdffile)

par(mgp=c(0.5,0,0),cex=1.2,mfrow=c(1,2),mar=c(1,2,1,0.5),pty='s')
col=rgb(0,0,1,0.05)
plot(lakes[,1],lakes[,2],cex=lakes[,3]/200, xlab='Lon', ylab='Lat',xaxt='n',yaxt='n')
if(add_trips)
{
for(i in 1:n_boaters)
{
for(tr in 1:length(trips[[i]]))
{
segments(boater[i,1],boater[i,2],lakes[trips[[i]],1],lakes[trips[[i]],2],col=col,lwd=2)
}
points(boater[i,1],boater[i,2],pch=15,col='blue',cex=0.5)
}
}

### Make a legend ###
plot(1,1,col='white', xlab='', ylab='',xaxt='n',yaxt='n',xlim=c(0,1),ylim=c(0,1))
yloc<-c(0.9,0.8,0.7,0.6)
n_tr<-c(5,10,15,20)
for(i in 1:4)
{
for(n in 1:n_tr[i])
segments(0.1,yloc[i],0.5,yloc[i],col=col,lwd=3)
text(0.6,yloc[i],n_tr[i])
}
text(0.82,0.75,"Number\nof trips")

yloc<-c(0.4,0.3,0.2,0.1)
l_size<-c(200,400,600,800)
for(i in 1:4)
{
for(n in 1:n_tr[i])
points(0.3,yloc[i],cex=l_size[i]/200)
text(0.5,yloc[i],l_size[i])
}
text(0.7,0.25,"Size\nof lake")

if(!is.null(pdffile))
dev.off()
}

With all of the relevant functions defined, we then simulate behaviour under each model, fit to each model and compare our ability to both capture the generating parameter values and to distinguish between the competing models.

boater<-sim_boaters()
lakes<-sim_lakes()
d_mat<-get_d_mat()

n_sims<-1000
n_par<-2

save_par<-array(dim=c(n_sims,2*n_par))
ll_save<-array(dim=c(n_sims,2))

### RUM GENERATING ###
for(i in 1:n_sims)
{
lakes<-sim_lakes()
d_mat<-get_d_mat()
## pars{b1,b2}
pars<-c(runif(1,0.01,0.1),runif(1,-10.5,-1.5))
trips<-sim_trips(pars,RUM)
opt<-optim(pars,ll,M=RUM,data=trips)

save_par[i,]<-c(pars,opt$par)

pars<-c(0.5,1,0,0) #seeds
opt2<-optim(pars,ll,M=GM,data=trips)

ll_save[i,]<-c(opt$value,opt2$value)
print(i)
}
par(mfrow=c(1,2),pty='s')
plot(ll_save)
abline(0,1,lty=2)

hist(dAIC(ll_save))
ll_saveRUM<-ll_save
save_parRUM<-save_par

par(mfrow=c(2,2),pty='s')
param_names<-c('B1','B2')
for(i in 1:n_par)
{
plot(save_parRUM[,i],save_parRUM[,i+n_par],xlab=paste('Generating ',param_names[i]),ylab=paste('Fit ',param_names[i]))
abline(0,1)
}

### GM GENERATING ###
for(i in 1:n_sims)
{
lakes<-sim_lakes()
d_mat<-get_d_mat()
## par{e,d}
e<-runif(1,0.1,0.9)
d<-runif(1,1,5)
pars<-c(e,d)
trips<-sim_trips(pars,GM)
opt<-optim(pars,ll,M=GM,data=trips)

save_par[i,]<-c(pars,opt$par)

pars<-c(0.001,-2) #seeds
opt2<-optim(pars,ll,M=RUM,data=trips)

ll_save[i,]<-c(opt$value,opt2$value)
print(i)
}

x11()
par(mfrow=c(1,2),pty='s')
plot(ll_save)
abline(0,1,lty=2)

hist(dAIC(ll_save))
ll_saveGM<-ll_save
save_parGM<-save_par

par(mfrow=c(2,2),pty='s')
param_names<-c('e','d')
for(i in 1:n_par)
{
plot(save_parGM[,i],save_parGM[,i+n_par],xlab=paste('Generating ',param_names[i]),ylab=paste('Fit ',param_names[i]))
abline(0,1)
}

### Plot deltaAIC distributions ###
par(mfrow=c(1,2),pty='s')
hist(dAIC(ll_saveGM),breaks=30)
hist(dAIC(ll_saveRUM),breaks=30)

plot_trips()

After verifying that we can both distinguish between the two models, and can recapture the generating parameters, we then use simulation, in the manner described by Peck (2004), to analyse the implications of using alternative models of boater behaviour on the spread of invasive species.

Insights derived from simulation need to be complimented with controlled, randomized experiments. However, the use of simulation as a laboratory for the statistical modeller is indispensable, and you can do it all without putting a white coat on – unless you’re into that kind of thing.

August 1, 2012

Olympic body match and 1:1 BMI

By Corey Chivers ¶ Posted in Rstats ¶ Tagged BMI, olympic body match ¶ 1 Comment

In my morning attempt to read the whole internet before beginning work, I came across a program on the BBC website which allows you to see which Olympic athletes are your body doubles. Or rather, which athletes share your height and weight, and therefore your body mass index. Being a Canadian, I exist in an uneasy hybrid world of measurements, comfortable with metric units for some things and imperial for others. I know my height in cm (185), and my weight in lbs (185). This makes my Olympic ‘body double’ Elizabeth Gleadle, the Canadian javelin thrower.

Apart from discovering that I may have a body type for women’s competitive javelin, I was intrigued by the symmetry of height(cm) = weight(lbs). Over what range is this equivalence healthy? A quick googling of BMI tells me that ‘normal’ range of BMI is from 18.5 to 25. I come in at 24.5; well fed, to be sure, but in the safe zone.

Scratching this curiosity itch is nothing that can’t be handled by a quick R script. Looks like you can carry the height = weight equivalence, adding a pound for each cm of height, all the way up to about 8 feet tall (~245lbs) before you’re underweight, according to BMI.

Okay, now I should get to work.

Code for the curious.

lowerH<-100
upperH<-162.5+(5*21.5)

hw<-seq(lowerH,upperH,length.out=100)
BMI<-numeric(100)
counter=1
for(i in hw)
{
h<-i
w<-i
BMI[counter]<-(w/2.205)/(1/100*h)^2
counter=counter+1
}

png('BMI_sameWH.png')
plot(hw,BMI,type='l',xlab='Height(cm) = Weight(lbs)',main='Same height as weight BMI')
polygon(x=c(90,300,300,90),y=c(50,50,30,30),col=colors()[325])
polygon(x=c(90,300,300,90),y=c(30,30,25,25),col=colors()[330])
polygon(x=c(90,300,300,90),y=c(25,25,18.5,18.5),col=colors()[235])
polygon(x=c(90,300,300,90),y=c(18.5,18.5,0,0),col=colors()[240])
lines(hw,BMI,lwd=2,col='red')
abline(h=18.5)
abline(h=25)
abline(h=30)
text(250,37.5,label='Obesity')
text(250,27.5,label='Overweight')
text(150,21,label='Normal')
text(150,16.75,label='Underweight')
points(185,(185/2.205)/(1/100*185)^2,pch=8,cex=2,col='blue')
legend('topright',legend="The sweet spot\n 185cm, 185lbs\n", pch=8,col='blue',bg='white')
dev.off()
#underweight = <18.5
#Normal weight = 18.5–24.9
#Overweight = 25–29.9

July 13, 2012

Dynamical systems: Mapping chaos with R

By Corey Chivers ¶ Posted in Rstats, Teaching ¶ Tagged bifurcation, chaos, cobweb plot, determinism, dynamical systems, ecology, logistic map, numerical methods ¶ 2 Comments

Chaos. Hectic, seemingly unpredictable, complex dynamics. In a word: fun. I usually stick to the warm and fuzzy world of stochasticity and probability distributions, but this post will be (almost) entirely devoid of randomness. While chaotic dynamics are entirely deterministic, their sensitivity to initial conditions can trick the observer into seeing iid.

In ecology, chaotic dynamics can emerge from a very simple model of population.

Where the population in time-step t+1 is dependent on the population at time step t, and some intrinsic rate of growth, r. This is known as the logistic (or quadratic) map. For any starting value of x at t₀, the entire evolution of the system can be computed exactly. However, there some values of r for which the system will diverge substantially with even a very slight change in the initial position.

We can see the behaviour of this model by simply plotting the time series of population sizes. Another, and particularly instructive way of visualizing the dynamics, is through the use of a cobweb plot. In this representation, we can see how the population x at time t maps to population x at time t+1 by reflecting through the 1:1 line. Each representation is plotted here:

You can plot realizations of the system using the following R script.


q_map<-function(r=1,x_o=runif(1,0,1),N=100,burn_in=0,...)
{
par(mfrow=c(2,1),mar=c(4,4,1,2),lwd=2)
############# Trace #############
x<-array(dim=N)
x[1]<-x_o
for(i in 2:N)
x[i]<-r*x[i-1]*(1-x[i-1])

plot(x[(burn_in+1):N],type='l',xlab='t',ylab='x',...)
#################################

##########  Quadradic Map ########
x<-seq(from=0,to=1,length.out=100)
x_np1<-array(dim=100)
for(i in 1:length(x))
x_np1[i]<-r*x[i]*(1-x[i])

plot(x,x_np1,type='l',xlab=expression(x[t]),ylab=expression(x[t+1]))
abline(0,1)

start=x_o
vert=FALSE
lines(x=c(start,start),y=c(0,r*start*(1-start)) )
for(i in 1:(2*N))
{
if(vert)
{
lines(x=c(start,start),y=c(start,r*start*(1-start)) )
vert=FALSE
}
else
{
lines(x=c(start,
r*start*(1-start)),
y=c(r*start*(1-start),
r*start*(1-start)) )
vert=TRUE
start=r*start*(1-start)
}
}
#################################
}

To use, simply call the function with any value of r, and a starting position between 0 an 1.


q_map(r=3.84,x_o=0.4)

Fun right?

Now that you’ve tried a few different values of r at a few starting positions, it’s time to look a little closer at what ranges of r values produce chaotic behaviour, which result in stable orbits, and which lead to dampening oscillations toward fixed points. There is a rigorous mathematics behind this kind of analysis of dynamic systems, but we’re just going to do some numerical experimentation using trusty R and a bit of cpu time.

To do this, we’ll need to iterate across a range of r values, and at each one start a dynamical system with a random starting point (told you there would be some randomness in this post). After some large number of time-steps, we’ll record where the system ended up. Plotting the results, we can see a series of period doubling (2,4,8, etc) bifurcations interspersed with regions of chaotic behaviour.

library(parallel)
bifurcation<-function(from=3,to=4,res=500,
x_o=runif(1,0,1),N=500,reps=500,cores=4)
{
r_s<-seq(from=from,to=to,length.out=res)
r<-numeric(res*reps)
for(i in 1:res)
r[((i-1)*reps+1):(i*reps)]<-r_s[i]

x<-array(dim=N)

iterate<-mclapply(1:(res*reps),
mc.cores=cores,
function(k){
x[1]<-runif(1,0,1)
for(i in 2:N)
x[i]<-r[k]*x[i-1]*(1-x[i-1])

return(x[N])
})

plot(r,iterate,pch=15,cex=0.1)

return(cbind(r,iterate))
}

#warning: Even in parallel with 4 cores, this is by no means fast code!
bi<-bifurcation()
png('chaos.png',width=1000,height=850)
par(bg='black',col='green',col.main='green',cex=1)
plot(bi,col='green',xlab='R',ylab='n --> inf',main='',pch=15,cex=0.2)
dev.off()

This plot is known as a bifurcation diagram and is likely a familiar sight.

Hopefully working through the R code and running it yourself will help you interpret cobweb plots, as well as bifurcation diagrams. It is really quite amazing how the simple looking logistic map equation can lead to such interesting behaviour.

bayesianbiologist

Corey Chivers on P(A|B) ∝P(B|A)P(A)