
Simulating polymers as self-avoiding random walks

Background: polymers and how to characterize them

Polymers are large molecules composed of many repeating subunits (the monomers). They play an important role in our everyday life, ranging from biological processes to synthetic materials (see Polymers on Wikipedia for more background).

In this project we want to simulate polymers that are linear and unbranched, and that are immersed in a good solvent. The latter condition of being in a good solvent means that different polymers are separated from each other and will not interact. We can thus consider the physics of a single polymer alone.

One particular property of a polymer is its lack of rigidity: in general polymers are rather floppy. We will not describe this from a microscopic model (such as quantum chemistry), but we will instead use a more simplistic toy model described below. Before we do that, let us however first consider how we can quantitatively describe a polymer.

We can assign the $i$-th subunit of the polymer a coordinate vector $\mathbf{r}_i$. A linear polymer consisting of $N$ subunits is then described by an ordered set of points $\{\mathbf{r}_1, \mathbf{r}_2, \dots, \mathbf{r}_N\}$. Note that there are $N$ subunits, but $N-1$ bonds between the subunits.

We can characterize the polymer by its end-to-end distance
$$ r_e^2 = \left(\mathbf{r}_N - \mathbf{r}_1\right)^2 . $$
Alternatively, we can use the radius of gyration that is given by
$$ r_g^2 = \frac{1}{N} \sum_{i=1}^{N} \left(\mathbf{r}_i - \mathbf{r}_\text{cm}\right)^2 , $$
where $\mathbf{r}_\text{cm} = \frac{1}{N} \sum_{i=1}^{N} \mathbf{r}_i$ is the center of mass of the polymer. The radius of gyration describes how "curled up" the polymer is.
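As a concrete illustration, here is a minimal NumPy sketch (the function names are our own) that computes both quantities from an array of subunit positions:

```python
import numpy as np

def end_to_end_sq(positions):
    """Squared end-to-end distance of a polymer given as an (N, d) array."""
    diff = positions[-1] - positions[0]
    return np.dot(diff, diff)

def radius_of_gyration_sq(positions):
    """Squared radius of gyration: mean squared distance to the center of mass."""
    center_of_mass = positions.mean(axis=0)
    return np.mean(np.sum((positions - center_of_mass) ** 2, axis=1))

# example: a short polymer on the square lattice
polymer = np.array([[0, 0], [1, 0], [1, 1], [2, 1], [2, 2]])
print(end_to_end_sq(polymer), radius_of_gyration_sq(polymer))
```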

Model: Polymers as a self-avoiding random walk on a lattice

We will now give a complete description of the polymer toy model. We will continue to describe a polymer by an ordered set of points $\{\mathbf{r}_1, \dots, \mathbf{r}_N\}$. In a polymer, the distance between subunits is determined by the length of the chemical bonds, and we will thus assume a fixed bond length $a$ between subunits: $|\mathbf{r}_{i+1} - \mathbf{r}_i| = a$. In contrast, for many polymers the angle between bonds is far less restricted, and we use this as the degree of freedom that determines the shape of the polymer.

To simplify the model even further, we will assume that the angle between different bonds is restricted to be a multiple of $90°$. In doing so, we restrict the polymer points to be on a square lattice with lattice constant $a$. Without restricting generality, we can set $a = 1$.

Figure: a self-avoiding random walk on the square lattice

In the above example, you may already see the last ingredient for our toy model: Since a polymer has a finite extent in space, two subunits cannot come too close. In our lattice model we implement this by demanding that one lattice point cannot be occupied by two different subunits. In other words: $\mathbf{r}_i \neq \mathbf{r}_j$ for all $i \neq j$.

With these assumptions, our polymer model becomes a self-avoiding random walk on a square lattice.

For such a self-avoiding random walk, the end-to-end distance and the radius of gyration are known to obey a scaling law:
$$ \langle r_e^2 \rangle \propto N^{2\nu}, \qquad \langle r_g^2 \rangle \propto N^{2\nu}, $$
where $\nu = 3/4$ for 2D, and $\nu \approx 0.588$ in 3D. (For more details, see the information at this link).

The goal of this project is to reproduce this scaling behavior with $\nu = 3/4$ (for 2D).

The behavior of the self-avoiding random walk should be contrasted to the free random walk, where both backtracking (the random walk takes a step back: $\mathbf{r}_{i+1} = \mathbf{r}_{i-1}$) and intersections are allowed:

Figure: a free random walk on the square lattice

For the free random walk the scaling behavior is well-known and we have $\langle r_e^2 \rangle \propto N$, i.e. $\nu = 1/2$.
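For contrast with the self-avoiding case, such a free random walk on the square lattice is simple to generate; a possible sketch (the naming is ours):

```python
import numpy as np

rng = np.random.default_rng()

# the four possible steps on the square lattice (lattice constant a = 1)
STEPS = np.array([[1, 0], [-1, 0], [0, 1], [0, -1]])

def free_random_walk(n_subunits):
    """Free random walk: backtracking and intersections are allowed."""
    steps = STEPS[rng.integers(0, 4, size=n_subunits - 1)]
    positions = np.zeros((n_subunits, 2), dtype=int)
    positions[1:] = np.cumsum(steps, axis=0)
    return positions

walk = free_random_walk(100)
print(np.sum((walk[-1] - walk[0]) ** 2))  # squared end-to-end distance of one walk
```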

Why sampling polymers is hard

In principle, our problem looks easy, as we have a very simple model, where every self-avoiding random walk of length $N$ has the same probability.

We could actually attempt to do a complete sampling of all possible polymers of a given length, and in this way compute the ensemble average exactly. To this end, we could generate all possible free random walks of length $N$ without back-tracking and throw away all intersecting walks. This however becomes exceedingly difficult very fast: at every step of the free random walk we have 3 possible directions, hence there are of the order of $3^N$ possible free random walks. Even for rather short walks this number is already astronomically large (for example, $3^{30} \approx 2 \times 10^{14}$).

It would thus be desirable to sample in the spirit of the Monte Carlo method only a small set of self-avoiding random walks, and use these to estimate the end-to-end distance/radius of gyration.

The Rosenbluth method

Description of the algorithm

The first such sampling method was developed by Rosenbluth and Rosenbluth in 1955.

Apart from the numerical method we will describe in a moment, this paper also contains a noteworthy overview of previous attempts, such as:

To the best of the authors' knowledge the first numerical calculations of chain length were made by Dr. Ei Teramoto of Kyoto University, who performed the remarkable feat of evaluating chain lengths in the two-dimensional case up to N = 20 by a hand calculation cataloging all possible chains.

You don't have to do this for the final report ;-)

The Rosenbluth method generates different polymers/self-avoiding random walks independently from each other, by growing every polymer individually. The polymer is grown successively by adding a new subunit to the end of the polymer, choosing randomly one of the unoccupied lattice sites adjacent to the current end point $\mathbf{r}_L$. You can then encounter different situations:

Figure: adding a new subunit to a polymer

Note that in the latter case, where all adjacent lattice sites are already occupied, no new subunit can be added, and the polymer sampling cannot continue beyond a certain length.

This process is repeated $M$ times. For each polymer $k$ (with $k$ denoting the polymer index, $k = 1, \dots, M$) we record all positions $\mathbf{r}_i^{(k)}$ (we can let all polymers start from the same initial point $\mathbf{r}_1 = 0$). In addition, we record for every polymer the weight
$$ W_L^{(k)} = \prod_{i=2}^{L} m_i^{(k)}, $$
where $m_i^{(k)}$ is the number of available positions to place the new subunit $i$ while growing polymer $k$.
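A minimal sketch of this growth procedure for a single polymer could look as follows (function and variable names are ours; the occupied sites are kept in a set so that the self-avoidance check is cheap):

```python
import numpy as np

rng = np.random.default_rng()

# the four lattice directions (lattice constant a = 1)
STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def grow_polymer(n_subunits):
    """Grow one polymer with the Rosenbluth method.

    Returns the visited positions and the weight W_L for every intermediate
    length L (the weight is set to 0 once the polymer gets stuck).
    """
    positions = [(0, 0)]              # all polymers start from the same point
    occupied = {(0, 0)}
    weight = 1.0
    weights = [weight]                # weight of the single-subunit polymer
    for _ in range(n_subunits - 1):
        x, y = positions[-1]
        free = [(x + dx, y + dy) for dx, dy in STEPS
                if (x + dx, y + dy) not in occupied]
        if not free:                  # stuck: no unoccupied neighbor left
            weights.extend([0.0] * (n_subunits - len(positions)))
            break
        weight *= len(free)           # multiply by the number m_i of free sites
        new_site = free[rng.integers(len(free))]
        positions.append(new_site)
        occupied.add(new_site)
        weights.append(weight)
    return positions, weights
```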

The average end-to-end distance or radius of gyration are then given by the weighted average
$$ \langle Q \rangle_L = \frac{\sum_{k=1}^{M} W_L^{(k)} Q^{(k)}}{\sum_{k=1}^{M} W_L^{(k)}}, \qquad (2) $$
where $Q^{(k)}$ is the value of the desired observable computed from polymer $k$.
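With the recorded weights and observable values of all $M$ polymers at a given length, Eq. (2) becomes a short function, for example:

```python
import numpy as np

def weighted_average(observables, weights):
    """Weighted average of Eq. (2); stuck polymers can simply carry weight 0."""
    observables = np.asarray(observables, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return np.sum(weights * observables) / np.sum(weights)
```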

Let us make some observations here:

  1. When generating polymers of length $N$ (disregarding the case where the polymer got stuck in the growing process), we also generate polymers of all lengths $L < N$. Hence, if we keep track of the weights at every growth step, we can compute from one simulation the complete length dependence of the end-to-end distance or the radius of gyration (up to length $N$).
  2. When a polymer gets stuck, you will end up with fewer than $M$ polymers beyond a certain length. Eq. (2) then still works: either you exclude these polymers from the sum, or you include them, but with weight 0 - both approaches give the same result.

Why the weighted average

The remarkable difference from other Monte Carlo methods is that the average in Eq. (2) is a weighted average. So where do these weights come from?

To answer this question, we have to compare the probability distribution with which we actually sample polymers, $P_\text{sample}$, to the probability distribution with which we should sample, $P_\text{true}$.

The probability of a polymer $k$ in the Rosenbluth method is given by
$$ P_\text{sample}^{(k)} = \prod_{i=2}^{L} \frac{1}{m_i^{(k)}} = \frac{1}{W_L^{(k)}}, $$
as evident from the growth procedure. However, in our model we said every polymer should have the same probability, and the correct probability would be given as
$$ P_\text{true}^{(k)} = \frac{1}{Z_L}, $$
where $Z_L$ is the total number of possible polymers with length $L$. These two probabilities differ: growing the polymer does not generate all polymers with equal probability, but biases them, as adding a new subunit depends on the previous steps.

So we are sampling with a distribution $P_\text{sample}$ that only approximates the real probability distribution $P_\text{true}$. This is an example of approximate importance sampling, and accordingly we find that the average is given as:
$$ \langle Q \rangle_L \approx \frac{1}{M} \sum_{k=1}^{M} \frac{P_\text{true}^{(k)}}{P_\text{sample}^{(k)}} Q^{(k)} = \frac{1}{M Z_L} \sum_{k=1}^{M} W_L^{(k)} Q^{(k)}. \qquad (3) $$
However, we still don't know $Z_L$ exactly - we need to estimate it. To this end we can make use of a rather trivial exact identity:
$$ Z_L = \sum_{\text{all SAW}(L)} 1 = \sum_{\text{all SAW}(L)} P_\text{sample} \, W_L , $$
where "sum over all SAW(L)" now denotes the sum over all possible self-avoiding walks of length $L$. As previously mentioned we cannot do this sum exactly. However, from the last identity, we see that $Z_L$ is also equal to the expectation value of $W_L$ when we sample according to $P_\text{sample}$. So in other words, it is the expectation value of $W_L$ computed from the paths we generate with our Rosenbluth method. Hence, we can estimate it with our Monte Carlo method as:
$$ Z_L \approx \frac{1}{M} \sum_{k=1}^{M} W_L^{(k)}. \qquad (4) $$
Plugging (4) into (3) then yields the weighted average from Eq. (2).

A more detailed analysis, including a discussion of biases that goes beyond the lowest order approximation used here, can be found in F. L. McCrackin, J. Res. NIST 76B, 193 (1972).

How to compute the error of a weighted average

Now that we have understood the origin of the weighted average, we turn to how to estimate the error of this average. What makes things easier for us is that all polymers of a given length are created independently - we thus do not need to take care of correlations.

Bootstrapping

The bootstrap technique is very general, and indeed we can also apply it to this situation. Let us show in detail how it works for this specific case:

We have $M$ polymers with associated weights $W^{(k)}$. This is our sample. With the bootstrap technique we now generate a new sample of the same size by choosing $M$ random polymers and their associated weights with replacement, i.e. some polymers may be chosen several times, and some not at all. From this new sample we then compute the desired end-to-end distance or radius of gyration $\langle Q \rangle_j$. We repeat this $n$ times to obtain a set of different expectation values $\langle Q \rangle_1, \dots, \langle Q \rangle_n$. From these we can estimate the error as
$$ \Delta Q = \sqrt{\frac{1}{n} \sum_{j=1}^{n} \left( \langle Q \rangle_j - \overline{\langle Q \rangle} \right)^2 }, \qquad \text{with} \quad \overline{\langle Q \rangle} = \frac{1}{n} \sum_{j=1}^{n} \langle Q \rangle_j , $$
which simply is the estimated standard deviation of the set of $\langle Q \rangle_j$.
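A possible implementation of this bootstrap error estimate (the number of resamples is a free parameter; naming is ours):

```python
import numpy as np

rng = np.random.default_rng()

def bootstrap_error(observables, weights, n_resamples=1000):
    """Bootstrap estimate of the error of the weighted average of Eq. (2)."""
    observables = np.asarray(observables, dtype=float)
    weights = np.asarray(weights, dtype=float)
    m = len(observables)
    estimates = np.empty(n_resamples)
    for j in range(n_resamples):
        # draw M polymers with replacement, keeping observable and weight paired
        idx = rng.integers(0, m, size=m)
        estimates[j] = np.sum(weights[idx] * observables[idx]) / np.sum(weights[idx])
    return np.std(estimates)
```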

Approximate analytical formulas

Apart from the numerical bootstrap procedure, there are also approximate analytical formulas. When computing the error of the estimate in Eq. (2) we must remember that both the numerator $\sum_k W^{(k)} Q^{(k)}$ and the denominator $\sum_k W^{(k)}$ are random variables. Following this stackexchange post, the best approximation comes from theory that deals with approximating the error of the expectation value of a ratio of random variables. In a compact form (equivalent to the formula in the stackexchange post), the error can be written as
$$ \left( \Delta \langle Q \rangle \right)^2 \approx \frac{M}{M-1} \, \frac{\sum_{k=1}^{M} \left( W^{(k)} \right)^2 \left( Q^{(k)} - \langle Q \rangle \right)^2}{\left( \sum_{k=1}^{M} W^{(k)} \right)^2} . \qquad (5) $$
Note that this is an approximation (derived for example in W.G. Cochran, Sampling Techniques). Gatz and Smith compared various approximate formulas to the result of bootstrapping for some weather data and found that Eq. (5) gave the best agreement. We confirmed this also for the polymer project.
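For comparison, the approximate analytical error of Eq. (5) can be sketched as:

```python
import numpy as np

def weighted_error_analytical(observables, weights):
    """Approximate error of the weighted average, following Eq. (5)."""
    observables = np.asarray(observables, dtype=float)
    weights = np.asarray(weights, dtype=float)
    m = len(observables)
    mean = np.sum(weights * observables) / np.sum(weights)
    variance = (m / (m - 1)) * np.sum(weights**2 * (observables - mean)**2) / np.sum(weights)**2
    return np.sqrt(variance)
```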

The problem with Rosenbluth

As for all approximate importance sampling techniques, the Rosenbluth method runs into a problem for longer polymers (i.e. a larger configuration space). In this case you will find vastly different weights for the polymers, with a few weights dominating the weighted average Eq. (2). Effectively, the number of polymers contributing significantly to the average is greatly reduced, which increases the statistical error.

Pruned-enriched Rosenbluth method (PERM)

Description of the method

To overcome this imbalance in weights, Grassberger introduced pruning and enrichment steps into the Rosenbluth method to improve the balance.

In particular, after adding a subunit to a polymer such that the current length is $L$, the following two steps are performed for every polymer $k$ (a code sketch of both steps is given after this list):

  1. Pruning: If the weight $W_L^{(k)} < W_-$, one of the following two actions is performed. Which of the two is performed is chosen randomly with equal probability (i.e. each with probability $1/2$):
    1. the polymer is discarded.
      That means the current polymer length $L$ is discarded and not used to grow further. However, the previous lengths of the polymer are still used for the averages for lengths $< L$.
    2. The weight is doubled: $W_L^{(k)} \to 2 W_L^{(k)}$.
      This new weight is used to compute the average for length $L$ and used to compute the weight of the polymer when grown further: e.g. if in the next step there are $m_{L+1}$ possibilities to place the subunit, $W_{L+1}^{(k)} = m_{L+1} \cdot 2 W_L^{(k)}$, etc.
  2. Enrichment: If the weight $W_L^{(k)} > W_+$, the polymer is copied, i.e. a second copy of this polymer is added for length $L$. The original and the copy are each assigned half of the original weight, $W_L^{(k)}/2$. These new weights are then used for all computations of averages of length $L$ and larger.
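As a minimal sketch of these two steps (our own naming and data layout, not code from Grassberger's paper), the decision for a whole population of polymers of the current length could be implemented like this:

```python
import numpy as np

rng = np.random.default_rng()

def prune_enrich(polymers, w_minus, w_plus):
    """Apply pruning and enrichment to a list of (positions, weight) pairs.

    Returns the population of polymers that is grown further.
    """
    survivors = []
    for positions, weight in polymers:
        if weight < w_minus:
            # pruning: discard with probability 1/2, otherwise double the weight
            if rng.random() < 0.5:
                continue
            survivors.append((positions, 2.0 * weight))
        elif weight > w_plus:
            # enrichment: keep two copies, each with half of the original weight
            survivors.append((list(positions), 0.5 * weight))
            survivors.append((list(positions), 0.5 * weight))
        else:
            survivors.append((positions, weight))
    return survivors
```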

Why does PERM work?

The pruning and enrichment steps are designed such that, regardless of the values of $W_-$ and $W_+$, the weighted average (2) does not change on average: copying a polymer halves its weight, and discarding (on average) half of a set of low-weight polymers doubles the weight of the other half. Hence, the pruning and enriching steps do not change the results, showing the correctness of the approach.

Obviously, PERM aims to improve on the Rosenbluth method by decreasing the number of polymers with low weight (which don't contribute much to the average although they are just as expensive in computing time), and at the same time increasing the number of promising polymers. The degree to which this is done is controlled by $W_-$ and $W_+$.

When choosing these values, it is important to realize that there is a balance. Pruning and enriching too little gives little improvement over Rosenbluth. But doing it too aggressively is also problematic! Then most polymers in the simulation are generated from a few original specimens, so that all of them become highly correlated, increasing the error in the simulation result again (we then lose the advantage of Rosenbluth that guaranteed statistically independent polymers in our sampling).

Grassberger suggests in his paper to choose $W_-$ and $W_+$ proportional to $\langle W \rangle_L$, where $\langle W \rangle_L$ is the average weight of the polymers at length $L$. He mentions particular proportionality constants being a good choice, but claims that the method is not very sensitive to the exact choice.
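Following this suggestion, the thresholds can be recomputed at every length from the running average weight; the proportionality constants below are placeholders for illustration, not values from the paper:

```python
import numpy as np

def thresholds(weights_at_length, c_minus=0.2, c_plus=2.0):
    """Pruning/enrichment thresholds proportional to the average weight at this length.

    c_minus and c_plus are placeholder constants that have to be tuned.
    """
    mean_weight = np.mean(weights_at_length)
    return c_minus * mean_weight, c_plus * mean_weight
```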

Calculating the error of the computed mean

With PERM, we are undoubtedly introducing correlations into the system. Unfortunately, there does not seem to be an easy way to account for this when computing the error of the average. If pruning and enriching is not overdone, most of the polymers will still be uncorrelated. From a practical perspective it thus seems justified to use the same methods for estimating the error of the average as for the Rosenbluth method.