The Little Schemer <http://www.amazon.com/Little-Schemer-Daniel-P-Friedman/dp/02...; and
Clause and Effect <http://www.amazon.com/Clause-Effect-Programming-Working-Prog....
Also a help is that the languages they use ('use' is a more appropriate word here than 'teach') — Scheme and Prolog, respectively — are much more mathematical in nature than typical mainstream languages are.
There's a meme that functional languages must have immutable values and be side-effect free, but I say that any language with first-class functions is a functional programming language.
It is useful and powerful to have strong controls over value mutability and side effects, but I believe these are peripheral to functional programming. Nice to have, but not essential. Even in languages that do have these properties, there is always a way to escape the purity of the paradigm.1
There is a grayscale continuum between truly pure FP languages that you can't actually use for anything practical and languages that are really quite impure but still have some of the FP nature to them:
Book FP: Introductory books on FP languages frequently show only a subset of the language, with all examples done within the language's REPL, so that you never get to see the purely functional paradigm get broken. A good example of this is Scheme as presented in The Little Schemer.
You can come away from reading such a book with the false impression that FP languages can't actually do anything useful.
Haskell: The creators of Haskell went to uncommon lengths to wall off impurity via the famous I/O monad. Everything on one side of the wall is purely functional, so that the compiler can reason confidently about the code.
But the important thing is, despite the existence of this wall, you have to use the I/O monad to get anything useful done in Haskell.2 In that sense, Haskell isn't as "pure" as some would like you to believe. The I/O monad allows you to build any sort of "impure" software you like in Haskell: database clients, highly stateful GUIs, etc.
Erlang: Has immutable values and first-class functions, but lacks a strong wall between the core language and the impure bits.
Erlang ships with Mnesia, a disk-backed in-memory DBMS, which is about as impure as software gets. It's scarcely different in practice from a global variable store. Erlang also has great support for communicating with external programs via ports, sockets, etc.
Erlang doesn't impose any kind of purity policy on you, the programmer, at the language level. It just gives you the tools and lets you decide how to use them.
OCaml and F#: These closely-related multiparadigm languages include both purely functional elements as well as imperative and object-oriented characteristics.
The imperative programming bits allow you to do things like write a traditional for loop with a mutable counter variable, whereas a pure FP program would probably try to recurse over a list instead to accomplish the same result.
The OO parts are pretty much useless without the mutable keyword, which turns a value definition into a variable, so that the compiler won't complain if you change the variable's value. Mutable variables in OCaml and F# have some limitations, but you can escape even those limitations with the ref keyword.
If you're using F# with .NET, you're going to be mutating values all the time, because most of .NET is mutable, in one way or another. Any .NET object with a settable property is mutable, for example, and all the GUI stuff inherently has side-effects. The only immutable part of .NET that immediately comes to mind is System.String.
Despite all this, no one would argue that OCaml and F# are not functional programming languages.
Any truly pure FP language is a toy language or someone's research project.
That is, unless your idea of "useful" is to keep everything in the ghci REPL. You can use Haskell like a glorified calculator, if you like, so it's pure FP.
This is clearly a homework, so I won't give you a straight answer. Instead, I'll point you in the right direction. For starters, split the problem in two procedures:
The first procedure, let's call it counter, receives an element and a list of elements. It traverses the list of elements asking, for each of them, if it's equal to the element passed as parameter. It adds one to the accumulated result if a match is found or continues with the next element if not. The list traversal ends when the null list is reached, and for this the counter returns zero.
The second procedure, called frequency receives the two lists in the question and traverses the first list (the list of elements to compare against). For each of those elements, it calls counter to find out the result, building up a list along the way.
Here's the general structure of the solution, you must fill-in the blanks:
(define (counter ele lst)
(cond ((null? lst)
((equal? ele <???>)
(<???> (counter ele <???>)))
(counter ele <???>))))
(define (frequency els lst)
(if (null? els)
(frequency <???> lst))))
Notice that in counter I'm assuming that the element is being searched at the base level in the list, for instance this won't find the element:
(counter 5 '((5)))
If you have to find matches like the one on the above example, then the problem is a bit more interesting - you'll need to recursively traverse the list of lists in a tree-like fashion. There are countless examples of that in Stack Overflow or elsewhere in Internet; if you're feeling a bit lost I'd recommend you take a look at either The Little Schemer or How to Design Programs, both books will teach you how to grok recursive processes in general.
In the book The Little Schemer, atom? is defined as follows:
(define (atom? x)
(and (not (pair? x))
(not (null? x))))
Noting that null is not considered an atom, as other answers have suggested. In the mentioned book atom? is used heavily, in particular when writing procedures that deal with lists of lists.
 - https://www.amazon.com/Little-Schemer-Daniel-P-Friedman/dp/0...