Geolocation, the task of predicting the location of a tweet from its text and other information, is a popular problem in NLP with great potential for social media applications. Typically, it is modeled as either multi-class classification or regression. In the first case, the targets are predefined areas; in the second, the model directly predicts geographic coordinates. Classification requires discretizing the coordinates but yields better performance. Regression is potentially more precise and truer to the nature of the problem, but often performs worse. We propose to combine the two approaches in an attention-based multitask convolutional neural network that jointly learns both discrete locations and continuous geographic coordinates. We evaluate the multitask learning (MTL) model against single-task models and prior work. We find that MTL significantly improves performance, with large gains on one data set, but also note that how well the label set corresponds to the coordinates has a marked impact on the effectiveness of including regression.
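To illustrate what jointly learning discrete locations and continuous coordinates can look like, here is a minimal sketch of a combined training objective. It is an assumption for illustration only, not necessarily the objective used in this work: the function names, the mean-squared-error regression term, and the weighting parameter `alpha` are all hypothetical.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy over integer class labels (the discrete areas)."""
    # Subtract the row-wise maximum for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability of the true class for each example.
    return -log_probs[np.arange(len(labels)), labels].mean()

def coordinate_mse(pred_coords, true_coords):
    """Mean squared error over (latitude, longitude) pairs."""
    return np.mean((pred_coords - true_coords) ** 2)

def multitask_loss(logits, labels, pred_coords, true_coords, alpha=0.5):
    """Classification loss plus a weighted regression loss.

    `alpha` is a hypothetical weight balancing the two tasks.
    """
    return (softmax_cross_entropy(logits, labels)
            + alpha * coordinate_mse(pred_coords, true_coords))

# Toy batch: 2 tweets, 4 candidate areas, 2-dimensional coordinates.
logits = np.zeros((2, 4))
labels = np.array([0, 1])
true_coords = np.array([[45.0, 9.0], [40.0, -74.0]])
pred_coords = true_coords + 1.0
loss = multitask_loss(logits, labels, pred_coords, true_coords)
```

In a sketch like this, both terms would be computed from a shared text representation, so the gradient from the coordinate term also shapes the features used for classification.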