Scientific Models

July 25, 2016 3:10 pm

Something I think gets lost in scientific education is what a "model" truly is.  We blur the line between model and reality until we forget that a model is exactly that--a model.

This conflation of terms is understandable when we talk about things we experience in our day-to-day lives.  We say things like, "when I toss a ball it moves in an arc," and not, "we can model the motion of the ball using an arc."  We don't worry about whether the ball actually moves in an arc or not--that distinction isn't particularly meaningful.

But when we start learning about more complex phenomena the distinction between the model and reality can become very important. A model is simply a representation of some process or phenomenon that we observe.  An acceptable model will match the observed behavior in a consistent, coherent manner.  And a good model will allow us to make accurate predictions about future events.

The "accepted" model for an observed behavior tends to be the one that allows us to make the most accurate predictions.  Utility is the lifeblood of models.

But a model, even one that allows us to make very accurate predictions, may not tell us anything about what's really happening.

I stumbled upon an interesting example of this dichotomy between models and reality while reading Blind Watchers of the Sky.  Through the 16th century it was "known" that celestial bodies moved in circles.  This was an accepted fact because celestial bodies were created by God and uncorrupted by man, God is perfect, and circles are the perfect shape.  The celestial bodies clearly moved, so they must move in circles.  This was the dogmatically accepted model of the time.

With crude measurements the concept of the planets moving in perfect circles seemed to fit well enough.  But eventually measurements got better and it became clear that the planets couldn't be moving in just simple circles.  Since the perfection of the heavens couldn't be challenged, the discrepancies were accounted for using epicycles (smaller circles moving along the larger circle) and other such complexity.

The model became more accurate, but was that truly how the heavens functioned?

Eventually Kepler proposed a radically different model.  But the pertinent part isn't what his new model was; it's how he presented it.  He essentially said something like, "Hey everyone, look, we all know circles are perfect, and the heavens are perfect because God created them and God is perfect; I'm not saying anything otherwise.  However, I found out that if we model the motion of the planets using ellipses the calculation is easier and the results are more accurate!"

The argument wasn't over whether the planets truly moved in circles or ellipses; that was a foregone conclusion at the time.  Instead it was the presentation of a model that allowed for more accurate predictions.  How the planets really moved didn't particularly matter.

--

An example I like to use to help separate the concepts of models from reality is this:

There is a thing on my desk.  I believe I can accurately predict how it will behave if I apply the model "spoon" to it.  I pick this object up and use it to eat soup.  My "spoon" model was accurate.  Jess now comes into the room, asks to use the item, and proceeds to use it in a way that no "spoon" I know of can be used.  She uses this thing to stab food and put it in her mouth.  The way she uses it would be better modeled by what I call a "fork."  So is it a spoon or is it a fork?

I could claim that sometimes it's a fork and sometimes it's a spoon.  That seems rather bizarre, yet it matches my observations.  Sometimes it acts like the things we call "spoons" and sometimes it acts like the things we call "forks."  But that doesn't mean it is both or that it transforms from one to the other.  It is what it is.  It only means that these models can both be useful in describing this thing depending on the circumstances.  However, a more accurate model is to realize it's something else entirely.  We need a new model we'll call "spork."

--

As one attempts to comprehend modern physics one is forced to separate "model" from "reality" if for no other reason than sanity.  When sub-sub-atomic particles were detected and their properties mapped, scientists needed words to assign to things that could only be observed indirectly.  So we ended up with terms like "spin" where nothing is really spinning, "color" where nothing emits a visible-light wavelength, and other such properties.  Then a fundamental set of six particles with various values for those properties was identified and needed names, and we ended up with "Up", "Down", "Charm", "Strange", "Top", and "Bottom" quarks.

I don't know what reality really is, but we keep building models with greater and greater accuracy that enable us to better predict future events.  At some level the distinction becomes irrelevant, but that's only true right up until reality does something our model says is impossible.  And then, it's time for a new model.

Flow

July 15, 2016 10:06 am

Most people would understand what is meant by saying one is "in the zone."  In psychology the concept goes by "flow."  Quoting from Wikipedia,

Flow is the mental state of operation in which a person performing an activity is fully immersed in a feeling of energized focus, full involvement, and enjoyment in the process of the activity. In essence, flow is characterized by complete absorption in what one does.

Mihály Csíkszentmihályi (yah, I have no idea how to pronounce that) wrote a book about flow and describes six factors required to achieve it (list borrowed from Wikipedia):

  1. Intense and focused concentration on the present moment
  2. Merging of action and awareness
  3. A loss of reflective self-consciousness
  4. A sense of personal control or agency over the situation or activity
  5. A distortion of temporal experience
  6. Experience of the activity as intrinsically rewarding

I think programming is uniquely suited to the creation of a state of flow.  When working on a programming problem I inherently become more intensely focused as more and more context is pulled into my working memory.  All the bits and pieces have to be tracked and accounted for.  The more complex the problem, the less room there is for extraneous thoughts.

This all-encompassing aspect of working memory feeds into point 2.  I'm not so much typing at a keyboard as modifying the interconnected pieces of the software as they're held in my working memory.  The typing is more a way to capture the changes as I produce them in my head.  But I'm not thinking about the keyboard in any way; it may as well not exist.

As the construction of the software takes over my mental processes I'm forced into letting go of any sense of self-consciousness.  There isn't any room in memory or processing power left to worry about it.  It is during these times that others might interrupt me to inform me that I'm whistling or tapping on my desk or doing some other thing that's bothering them.  I'm completely unaware that I'm doing it.  Having a private office at work really helps in this regard.  When I must maintain self-conscious awareness to be courteous to those around me, it inhibits my ability to enter a state of flow.

Programming is entirely about personal control over the situation.  It is the programmer's mind being melded with the limitations of the machine and language.  Once the keyboard and the monitor melt away as mere extensions of one's own thoughts and senses it is simply a matter of solving the problems and verifying the solutions.

The time distortion is one of the most fascinating aspects of flow.  While in flow I can work for hours on something and it will feel like just minutes.  Hunger disappears, emails are ignored, music is unheard.  What I've recently been able to observe is the "awakening" process that occurs at the end of flow.

A few weeks ago I was working on a new feature in one of our applications at work.  I had my browser up to test changes as I went, energetic music was playing, and my text editor was open with all the needed files loaded.  I loaded all the relevant information into my head and began implementing the solution.

As my commits to version control piled up and the task was completed, I was aware of the state of flow ending.  It's very much a letdown.  Flow is a heightened state of awareness and efficacy.  Coming out of it feels something like rapidly becoming dumber.

As I came out, I became aware of the music playing to the point that it was distracting, and turned it off.  I glanced at my email inbox and wondered how that many emails had come in without me noticing.  I noticed the time and realized both that it was past time to head home and that I was really hungry.  And, ultimately, I re-entered reality with almost an awed feeling of having lived in a land where thoughts and actions blurred.  Where my abilities were a cut above the normal day-to-day levels.

Achieving flow is not something that happens daily for me.  It can be weeks between sessions.  But when I can achieve flow it reminds me how enjoyable problem solving can be.  Writing software is the medium, but not the goal.  I believe it's the satisfaction of finding and implementing solutions that drives flow for me.

Improving OwnCloud Throughput

April 1, 2016 10:22 pm

I have an instance of OwnCloud running from a machine at home that provides file-syncing services for family members.  The OwnCloud data is then encrypted and sent on to CrashPlan for backups.

I recently pointed OwnCloud at 1.3 TB of data to sync.  These are old home videos in raw format, with files up to 25 GB each.  The upload speed was atrocious.  The server is connected to my desktop through a gigabit switch, yet transfer speeds were topping out at 2.0 MB/s.

Most of the issues people have with poor OwnCloud performance involve uploading many small files, which is not my scenario.  But I followed whatever advice I could find.  I modified MariaDB settings and used MySQLTuner to find potential performance gains, which helped a little.  I finally found the backported php-apc package I needed on Ubuntu 14.04 to provide PHP caching, which also helped a little.  But I was still only up to ~4.5 MB/s.
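
As a side note for anyone retracing this: installing the caching package by itself mostly benefits PHP.  To have OwnCloud actually use APCu as its local memory cache, the admin documentation of that era suggests adding a line along these lines to config/config.php (a sketch; check the manual for your OwnCloud version):

    'memcache.local' => '\OC\Memcache\APCu',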

Then I considered my larger system.  The server is on my local gigabit switch, but my desktop's sync client is configured with the server's public domain name, which resolves to my public IP address.  This meant every request from my desktop wasn't just going through the gigabit switch and into the server.  Instead every request was going through the switch to the router, being NAT-translated, back to the switch, and then to the server.  Due to an issue with my high-performance EdgeRouter Lite, I've been using my old WRT54GL as my router.  And that old thing simply can't handle the load.  Its CPU was maxed out and network throughput was abysmal.

Since I wanted to bypass the router and go directly from desktop to switch to server, I made an entry in my /etc/hosts file to tell my machine to use the server's internal IP address instead of the public IP address associated with the domain name.  The CPU load on the router is now gone and the OwnCloud throughput increased to ~11 MB/s.  Still pretty awful compared to the ~60 MB/s I get using scp, but substantially better than 2 MB/s.
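
The entry itself is just the server's LAN address followed by the hostname the sync client uses (both values below are placeholders, not my real address or domain):

    192.168.1.50    owncloud.example.com

With that in place the desktop resolves the name directly to the switch-local address and the router never sees the traffic.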

That speed increase was going strong for a while, but after about 20 minutes it slowed back down to ~4.5 MB/s again.  The router, however, is no longer in the loop, so at least I've removed one potential bottleneck.

I have no idea what the bottleneck is now except for OwnCloud just being abysmally slow.  The server is using a fair bit of CPU, but it's not quite maxed out (usually showing 20% idle overall on a 4-core machine).  IO doesn't seem to be the bottleneck; iotop doesn't show anything being held up.  There's 2 GB of free RAM available, so that doesn't seem to be the issue either.
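
For the curious, those observations come from nothing fancier than the usual tools run on the server, roughly:

    top        # overall CPU usage and idle percentage
    iotop -o   # only the processes actually doing IO
    free -m    # memory in use vs. available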

I'm running OwnCloud using Apache with mod_php, but I didn't see anything suggesting that running PHP using fcgi or fastcgi would be better.  Using nginx instead of Apache might help, but I have no experience configuring nginx so it wouldn't be a short little project to try it.

If anyone has any suggestions on how to get OwnCloud to perform better (particularly when syncing very large files) when connecting over a gigabit local network I'd love to hear them.

Update 4/17/2016

The slowdown from ~10 MB/s seems to have been the accumulation process running on large files.  Files are transferred in small chunks.  Once all the chunks have been uploaded, they are accumulated and the original file is reconstituted.  During this time the upload speed drops dramatically.

I had a terminal case where a known memory leak in the accumulation process kept causing the reconstitution to fail.  Since the chunks are deleted as they're used during accumulation, the client would then re-upload all the deleted chunks, the accumulation process would run and fail again, and we'd go round and round.  This made it look like performance was worse than it truly was (though failing to upload files is sort of a bad thing for a file-sync tool to do).

In an attempt to get my files synced without waiting for the memory leak fix to be released, I split some files into smaller pieces so the accumulation process doesn't leak as much memory.  I also set a rather absurd 4 GB memory limit on the PHP process, hoping that will be enough to get it through the large files without failing.
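
For reference, the memory limit is a one-line PHP setting; on Ubuntu 14.04 with mod_php it normally lives in /etc/php5/apache2/php.ini (the path may differ on other setups), and Apache needs a restart afterwards:

    memory_limit = 4096M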

Setting aside the accumulation/reconstitution process, I'm getting a consistent 10 MB/s transfer on a 100 Mbit switch.  I had to RMA the gigabit switch because it began misbehaving.  I'm hopeful that when the replacement arrives my throughput will increase beyond 10 MB/s, since that's about the limit of a 100 Mbit link.

Update 4/19/2016

My replacement gigabit switch is in and the transfer rate has gone up to ~19 MB/s at times (when no accumulation/reconstitution work is occurring).  There is still plenty of room for improvement, but at least I'm not stuck at 2 MB/s anymore.

Converting Http Session Events into Grails 3 Events

October 19, 2015 1:16 pm

Grails 3 introduced a new Events API based on Reactor.  Unfortunately, as far as I can tell, HttpSessionEvents are not natively part of the Grails 3 Events system.  Bringing them into the fold, however, is pretty easy.  I based this on Oliver Wahlen's immensely helpful blog post about sending the HttpSessionEvents to a Grails service.

First, let's create our Spring HttpSessionServletListener.  Create this file somewhere in the /src/ path where Grails will find it:

File: .../grailsProject/src/main/groovy/com/example/HttpSessionServletListener.groovy
package com.example

import grails.events.*
import javax.servlet.http.HttpSession
import javax.servlet.http.HttpSessionEvent
import javax.servlet.http.HttpSessionListener

class HttpSessionServletListener implements HttpSessionListener, Events {
  
    // called by servlet container upon session creation
    void sessionCreated(HttpSessionEvent event) {
        notify("example:httpSessionCreated", event.session)
    }

    // called by servlet container upon session destruction
    void sessionDestroyed(HttpSessionEvent event) {
        notify("example:httpSessionDestroyed", event.session)
    }
}

Now register the HttpSessionServletListener as a Spring Bean.  If you don't already have a resources.groovy file, create one and add the following.

File: .../grailsProject/grails-app/conf/spring/resources.groovy
import org.springframework.boot.context.embedded.ServletListenerRegistrationBean
import com.example.HttpSessionServletListener

beans = {
    
    httpSessionServletListener(ServletListenerRegistrationBean) {
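        // the registration bean hands our listener to the embedded servlet container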
        listener = bean(HttpSessionServletListener)
    }
    
}
// Yes this is the entire file

Now you are all set to listen for the "example:httpSessionCreated" and "example:httpSessionDestroyed" events using the Grails 3 Events API.  Here "example" is the event namespace; in my real code I set it to the last part of the package name so the two match.  Just use something unique enough that you don't have to worry about naming collisions.

Here's an example of listening for the events in a standard Grails Controller.  Note that the event handlers are attached after construction, and before the Controller bean is made available, by using the PostConstruct annotation.

File: .../grailsProject/grails-app/controllers/com/example/ExampleController.groovy
package com.example

import grails.events.*
import javax.annotation.PostConstruct

class ExampleController {
    
    @PostConstruct
    void init() {
        
        on("example:httpSessionCreated") { session ->
            println "sessionCreated: ${session.id}"
        }
        
        on("example:httpSessionDestroyed") { session ->
            println "sessionDestroyed: ${session.id}"
        }
    }
}

NASA Apollo Pictures

October 5, 2015 5:40 pm

Last week NASA released a bunch (over 10,000) of original images from the Apollo missions on their Flickr account.  They're all Public Domain images so anyone can download the originals and use them for anything they like.  I flipped through and picked out my favorites and cleaned them up.  I'll probably get some nice canvas prints made of some of them when Canvas Press has sales.

Here are my top 10 after cleaning them up.  I've uploaded my full versions so you can download them yourself if you want to make a poster or canvas print or something.  Clicking an image will open the full-size version, which you can then save to your computer using right-click -> Save image...

[Gallery: the 10 cleaned-up Apollo images]