Classical computational learning models require success under worst-case presentations of the data: Gold's (1967) identification in the limit paradigm demands convergence on every text for the target language, and PAC learning requires learning under all possible probability distributions over data samples. By modifying PAC learning to restrict the set of admissible distributions, it is possible to show that different classes of languages are learnable than in the distribution-free framework. I will explore some recent work on distribution-based language learning. Clark and Thollard (2004) prove that when the set of distributions is limited to those generated by Probabilistic Deterministic Finite State Automata, the class of regular languages is PAC learnable from positive evidence alone. Clark (2006) demonstrates a similar result for an interesting subclass of context-free languages when the set of distributions corresponds to a specified set of Probabilistic Context Free Grammars. I will then discuss how this approach might be used to increase the range of evidence available for learning. Clark and Lappin (2009) suggest that an appropriate restriction on the set of possible distributions for PAC learning could provide the basis for a stochastic model of indirect negative evidence.
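To make the central object concrete, the following is a minimal sketch of a Probabilistic Deterministic Finite State Automaton, the device whose output distributions constrain the learner in Clark and Thollard's (2004) result. The two-state automaton, its alphabet, and its probabilities are illustrative assumptions of mine, not taken from that paper: each state assigns probability mass to its outgoing symbols plus an end-of-string event, and determinism means each symbol has a unique next state, so the machine defines a single probability distribution over strings.

```python
import random

# Illustrative toy PDFA (hypothetical, not from the cited papers).
# Each state maps a symbol to (probability, next_state); the special
# symbol "$" is the end-of-string event, with no successor state.
PDFA = {
    "q0": {"a": (0.5, "q1"), "$": (0.5, None)},
    "q1": {"b": (0.7, "q0"), "$": (0.3, None)},
}

def sample_string(pdfa, start="q0", rng=random):
    """Draw one string from the distribution the PDFA defines."""
    out, state = [], start
    while True:
        symbols = list(pdfa[state])
        weights = [pdfa[state][s][0] for s in symbols]
        sym = rng.choices(symbols, weights=weights)[0]
        if sym == "$":                      # end-of-string event
            return "".join(out)
        out.append(sym)
        state = pdfa[state][sym][1]

def string_prob(pdfa, s, start="q0"):
    """Exact probability the PDFA assigns to the string s."""
    p, state = 1.0, start
    for ch in s:
        if ch not in pdfa[state]:
            return 0.0                      # no transition: zero mass
        w, state = pdfa[state][ch]
        p *= w
    # Multiply by the probability of stopping in the final state.
    return p * pdfa[state].get("$", (0.0, None))[0]

print(string_prob(PDFA, ""))    # probability of the empty string: 0.5
print(string_prob(PDFA, "ab"))  # 0.5 * 0.7 * 0.5 = 0.175
```

The point of the restriction is visible here: a learner receiving positive samples drawn by `sample_string` sees strings with frequencies governed by the automaton's structure, so the distribution itself carries information about the target regular language, which is what the distribution-free PAC setting deliberately withholds.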