Interesting Numbers

CrazedWalrus (901897) writes | more than 5 years ago

User Journal 3

I've never been a particularly normal person, and I guess my hobbies are a reflection of that. I'm a programmer by trade, but I've recently taken an interest in historical statistics in light of the current financial crisis. I'll say up front that stats were never my strong point, so feel free to tell me if you think I'm doing anything out of line here, but I think it's pretty straightforward.

Last month on a lark I went to the Dow Jones web site and downloaded the historical Dow Jones Industrial Averages by day. After some minor massaging of the formatting, I loaded them up in Postgres and started poking around looking for similar events to put the current financial "crisis" into perspective.

One interesting query I hit upon is this:

-- Yearly average and volatility per security, joined to the previous year's
-- figures, keeping only the years whose average fell.
select stats.*
        , prev.*
        , case
                when stats.avg_value > prev.avg_value then 'UP'
                else 'DOWN'
            end as direction
        -- note: the denominator is the current year's average, not the previous year's
        , (stats.avg_value - prev.avg_value)/stats.avg_value * 100 as pct_change
from (
        -- current year: average, standard deviation, and stddev as a % of the average
        select securityid
                , extract(year from valuetime)::int as year
                , avg(value)::numeric(20,2) as avg_value
                , stddev(value)::numeric(20,2) as stddev
                , (stddev(value) / avg(value) * 100)::numeric(20,2) as pct_dev
        from exchange.history
        group by securityid, extract(year from valuetime)::int
) stats
        inner join
(
        -- same aggregation again, used as the previous year
        select securityid
                , extract(year from valuetime)::int as year
                , avg(value)::numeric(20,2) as avg_value
                , stddev(value)::numeric(20,2) as stddev
                , (stddev(value) / avg(value) * 100)::numeric(20,2) as pct_dev
        from exchange.history
        group by securityid, extract(year from valuetime)::int
) prev
        on stats.securityid = prev.securityid  -- only compare a security with itself
        and stats.year = prev.year + 1
where case when stats.avg_value > prev.avg_value then 'UP' else 'DOWN' end = 'DOWN'
order by (stats.avg_value - prev.avg_value)/stats.avg_value * 100 asc;

That's probably not very interesting, except that you can see how I'm getting my numbers. The history table it's drawing on is a series of daily Dow closing values.
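For anyone who would rather follow the arithmetic outside the database, here is a rough Python sketch of what the query computes. It is not the code I ran: the sample rows are made up (not real Dow values), the function names are only for illustration, and it skips the rounding to two decimals that the query does.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical sample data: (year, daily closing value). Not real Dow figures.
closes = [
    (2006, 100.0), (2006, 110.0), (2006, 120.0),
    (2007, 120.0), (2007, 130.0), (2007, 140.0),
    (2008, 130.0), (2008, 100.0), (2008, 70.0),
]

def yearly_stats(rows):
    """Return {year: (avg, stddev, pct_dev)}, like the two inner subqueries."""
    by_year = defaultdict(list)
    for year, value in rows:
        by_year[year].append(value)
    stats = {}
    for year, values in by_year.items():
        avg = mean(values)
        sd = stdev(values)  # sample standard deviation, like Postgres stddev()
        stats[year] = (avg, sd, sd / avg * 100)
    return stats

def down_years(rows):
    """Return (year, pct_change) for years whose average fell, worst first."""
    stats = yearly_stats(rows)
    result = []
    for year, (avg, _sd, _pct_dev) in stats.items():
        if year - 1 not in stats:
            continue  # inner join: years without a previous year drop out
        prev_avg = stats[year - 1][0]
        if avg > prev_avg:
            continue  # the query keeps only the 'DOWN' years
        # Same formula as the query: divides by the current year's average.
        pct_change = (avg - prev_avg) / avg * 100
        result.append((year, pct_change))
    return sorted(result, key=lambda item: item[1])

print(down_years(closes))
```

With this toy data only 2008 comes back, because 2007 averaged higher than 2006 and 2006 has no previous year to join against.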

Some interesting points from that query:

1. 2008 was the 11th worst year since 1897 in terms of percentage decline from the previous year, at a 15% decline. The ten worst years were, in order of suckiness:

1932
1931
1930
1938
1907
1921
1974
1903
1970
1900

2. Standard deviation of 2008 was 1558.46, versus 526.07 in 2007.
3. In terms of the ratio between stddev and avg value of the Dow, 2008 was 8th, with a standard deviation of 13.61% of average Dow value.

So in terms of overall suckage, 2008 ranks pretty high. But we knew that.

From that, you might be tempted to think the whole year was awful, but in reality, it simply wasn't. We had 4 bad months -- January, June, October and November. October was the worst of 2008, losing 21.1%, followed by June at -11.7%, January at -6.9%, and November at -6.7%.

That said, June was the "craziest" month with the largest swings in prices -- stddev was 2652.53 versus 684.27 for October and 496.80 for November.

In terms of history, October was the 4th worst monthly decline (21.11%) since 1897, following:

November 1929 (-37.79%)
April 1932 (-29.98%)
December 1931 (-29.03%)

June comes in at 30th worst decline at 11.6%, January at 73rd worst monthly decline (6.9%), and November at 78th worst decline at 6.67%.

How this plays into history remains to be seen. However, 2008 was only about half of the percentage decline seen during the depression years. For 2008 to be analogous to the depression, we'd need to see about twice the yearly decline, sustained for several years. 2009-2011 will determine which direction the country goes. The Great Depression had these types of losses for 3 years, followed by more moderate yearly changes.

Several other big-loss years turned out to be anomalies, followed by normal years. One example is 1907. 1905 and 1906 were huge gainer years, climbing 32% and then 16% before crashing 24% in 1907. The very next year, the decline was 1.6% and then 19% growth in 1909, followed by normal good, bad, and "blah" years where nothing really significant happened on the Dow. There were even several banner years up until 1929. In almost every case, losing years were erased in fairly short order.

I guess the point is that the current "crisis" is not unprecedented by any means, and isn't even half as bad as several years in the past. We as a nation need to be very careful not to make fiscal policy and sacrifice civil liberties for the sake of recovering from events that have happened many times in our past. The market has booms and busts, but there's no need to panic and give up our values and way of life.

3 comments

With such a long time series (1)

tqft (619476) | more than 5 years ago | (#26305293)

Wouldn't your stdev comparison be skewed by the implicit inflation in the numbers (or did I miss that in the code)? As the numbers in 2008 are a lot larger than in other years, the changes (variance) are going to be bigger.

The % changes method seems reasonable but am wary of anything with absolute numbers over more than a few years.

While I have worked with statistics for a living - I hate them - easy to calculate and hard to interpret. Doesn't mean I don't use them - just that I have come close to some major screwups with statistics and their use by others.

You could try rebasing the series into inflation adjusted dollars and rerunning your analysis.
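A minimal sketch of what the rebasing could look like, assuming you have some yearly price index to deflate with. The index values and closing values below are invented for illustration (not real CPI or Dow data), and `to_real` is just a name picked for this example.

```python
# Hypothetical price index by year. Not real CPI data.
price_index = {1920: 10.0, 1970: 20.0, 2008: 100.0}

# Hypothetical nominal closing values as (year, value). Not real Dow data.
nominal = [(1920, 100.0), (1970, 800.0), (2008, 9000.0)]

def to_real(rows, index, base_year):
    """Rescale each nominal value into base_year dollars."""
    base = index[base_year]
    return [(year, value * base / index[year]) for year, value in rows]

real = to_real(nominal, price_index, base_year=2008)
for year, value in real:
    print(year, value)
```

The rebased values would then go through the same yearly grouping, so the stdev comparison is made in constant dollars.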

One test I like to do on any analysis is to find a way or interpretation that is wrong and see if that means the analysis framework is broken.

Re:With such a long time series (1)

CrazedWalrus (901897) | more than 5 years ago | (#26370463)

Thanks for taking the time to help me out. You're right -- inflation is a difficult problem to solve and does need to be taken into account.

I tried to get a handle on inflation using the stats.pct_dev column, which is the ratio of the stddev to the avg Dow for the year.

The idea is that I could sort based on the percentage stddev/avg, and that would show me years with the highest proportional volatility. So a 100-point swing in 1920 would have been disastrous, and would rank very highly, but a 100-point swing in 2008 would be toward the bottom of the list -- occurring almost daily.
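To make that concrete, here is a small Python sketch with made-up numbers (not real Dow values) that puts the same 100-point swing at two different price levels. The `pct_dev` function only mirrors the stddev/avg column from the query; it isn't the query itself.

```python
from statistics import mean, stdev

def pct_dev(values):
    """Standard deviation as a percentage of the mean, like the pct_dev column."""
    return stdev(values) / mean(values) * 100

# Hypothetical values: the same 100-point swing at two price levels.
low_level = [50.0, 150.0]       # swing around an average of 100
high_level = [8950.0, 9050.0]   # swing around an average of 9000

for name, values in (("low", low_level), ("high", high_level)):
    print(name, round(stdev(values), 2), round(pct_dev(values), 2))
```

Both series have the same absolute stddev, but the swing is a large fraction of the average at the low level and well under 1% at the high level, which is the effect the ranking relies on.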

That doesn't totally deal with the inflation problem, but it does help to put the numbers in better perspective. I have another query that I didn't post -- essentially the same, but where I ran the numbers grouped by month. I wrote about the results of that query when I was talking about how the different months rank with history. In this query, I don't think inflation would be much of an issue.
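Since I didn't post the monthly query, here is a rough Python sketch of the same idea rather than the query itself: average the closes per month, then compare each month to the one before it, using the same formula as the yearly query. The rows are hypothetical (not real Dow values) and the function names are just for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (year, month, daily close) rows. Not real Dow figures.
closes = [
    (2007, 12, 200.0), (2007, 12, 220.0),
    (2008, 1, 190.0), (2008, 1, 170.0),
    (2008, 2, 180.0), (2008, 2, 198.0),
]

def previous_month(year, month):
    """Return the (year, month) key of the month before, rolling over in January."""
    return (year - 1, 12) if month == 1 else (year, month - 1)

def monthly_changes(rows):
    """Return {(year, month): pct_change versus the previous month's average}."""
    by_month = defaultdict(list)
    for year, month, value in rows:
        by_month[(year, month)].append(value)
    averages = {key: mean(values) for key, values in by_month.items()}
    changes = {}
    for (year, month), avg in averages.items():
        prev = averages.get(previous_month(year, month))
        if prev is not None:
            # Same formula as the yearly query: divides by the current average.
            changes[(year, month)] = (avg - prev) / avg * 100
    return changes

for key, change in sorted(monthly_changes(closes).items()):
    print(key, round(change, 2))
```

Because each month is only compared with the month right before it, there is very little time for inflation to creep into the comparison.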

Either way, the effect of inflation on the rankings should be fairly well contained, because percentage change, stddev, and pct_dev (stddev/avg) were all calculated on yearly or monthly basis. The % change between years was only ever compared between consecutive years, and there could definitely be some inflationary skew there, but I'm thinking it's also generally pretty reasonable and doesn't make the comparison unfair.

Re:With such a long time series (1)

tqft (619476) | more than 5 years ago | (#26375787)

If the stdev's are important - option pricing even though Black-Scholes is almost always bad in my view - it may be worth the effort.

"essentially the same, but where I ran the numbers grouped by month. I wrote about the results of that query when I was talking about how the different months rank with history. In this query, I don't think inflation would be much of an issue.

Either way, the effect of inflation on the rankings should be fairly well contained"

Risk management time - if this is for your entire net worth or a large % thereof, or a major project for someone else, extra care is required. At the least, find out what they are using it for so they don't end up trading on your assumptions.

Alternate query sounds good though. Particularly as it isn't extra work to get done.

If you get a radically different outcome you might want to work out why they are different - just to make sure you are looking at real information rather than computational artifacts. Been there, done that.

Like you said the % changes should be fairly immune.

One place I used to work used RSE (relative std error) as a measure of published data control - RSE = stdev/average. It may be informative in this case.
