BLLIP on Mavericks #20

Open
wants to merge 21 commits into
base: master
Commits
b825401
Compile on Mavericks with gcc47.
jimwhite Mar 20, 2014
7c55d76
Add my MacPorts setup steps.
jimwhite Mar 20, 2014
f37a1b5
Chase down why -march=native isn't working for gcc after 4.2. Add so…
jimwhite Mar 21, 2014
00b2ea8
The simple switch is just disabling AVX.
jimwhite Mar 21, 2014
95680e9
Missing a negation does bad things to meaning.
jimwhite Mar 21, 2014
39a51ea
MacPorts will gladly install more than one library at a time.
jimwhite Mar 21, 2014
bc3500b
Move Mac-specific settings into Makefile.mac from Makefile.
jimwhite Jul 10, 2014
bd9be5c
Use ESTIMATORNICKNAME=lbfgs-l1c10F1n1p2 to match the other current se…
jimwhite Jul 10, 2014
77837e5
A Git .gitignore copied from .hgignore.
jimwhite Jul 10, 2014
cfaa974
Tidied up .gitignore and add second-stage/nbest and tmp dirs.
jimwhite Jul 10, 2014
13fb007
Set LD_LIBRARY_PATH too.
jimwhite Jul 10, 2014
9343b8c
Add a rule to .LIBPATTERNS so that SParseval will make.
jimwhite Jul 10, 2014
7a4d1e0
Don't set standard make variables if they are already set.
jimwhite Jul 10, 2014
d4d375b
index on BLLIP_FOR_MACOS: bd9be5c Use ESTIMATORNICKNAME=lbfgs-l1c10F1…
jimwhite Jul 10, 2014
42e70d8
WIP on BLLIP_FOR_MACOS: bd9be5c Use ESTIMATORNICKNAME=lbfgs-l1c10F1n1…
jimwhite Jul 10, 2014
37ab98c
Merge commit 'stash' into BLLIP_FOR_MACOS
jimwhite Jul 10, 2014
81bcc86
Seems to build and run everything (including SParseval and the
jimwhite Jul 10, 2014
4bc24de
Remove Mac-specific stuff from Makefile.
jimwhite Jul 10, 2014
6b6ccde
Always use 'gzip -c' to avoid failures caused by extra hardlinks.
jimwhite Jul 11, 2014
baecd5a
Merge branch 'master' of github.com:BLLIP/bllip-parser into BLLIP_ON_…
jimwhite Oct 26, 2014
04ba357
nltk.tree.Tree.parse has apparently been removed/replace by fromstrin…
jimwhite Oct 26, 2014
67 changes: 67 additions & 0 deletions .gitignore
@@ -0,0 +1,67 @@
*_wrapper.cxx
*.a
*.class
*.d
*.dep
*.o
*.orig
*.py[co]
*.so
*.swp

/SParseval
/tmp

evalb/evalb

build*
dist*

MANIFEST
regression-test-*
tags
TAGS

python/bllipparser/CharniakParser.py
python/bllipparser/JohnsonReranker.py

first-stage/PARSE/evalTree
first-stage/PARSE/parseAndEval
first-stage/PARSE/parseIt
first-stage/PARSE/parser_wrapper.C
first-stage/PARSE/swig/*/build/*
first-stage/PARSE/swig/*/lib/*
first-stage/TRAIN/iScale
first-stage/TRAIN/kn3Counts
first-stage/TRAIN/pSfgT
first-stage/TRAIN/pSgT
first-stage/TRAIN/pTgNt
first-stage/TRAIN/pUgT
first-stage/TRAIN/rCounts
first-stage/TRAIN/selFeats
first-stage/TRAIN/trainRs

/second-stage/nbest

second-stage/programs/*/read-tree.cc
second-stage/programs/eval-beam/main
second-stage/programs/eval-weights/eval-weights
second-stage/programs/features/best-*parses
second-stage/programs/features/count-*features
second-stage/programs/features/extract-*features
second-stage/programs/features/oracle-score
second-stage/programs/features/parallel-extract-nfeatures
second-stage/programs/features/parallel-extract-spfeatures
second-stage/programs/features/reranker_wrapper.C
second-stage/programs/features/swig/*/build/*
second-stage/programs/features/swig/*/lib/*
second-stage/programs/prepare-data/copy-trees-ss
second-stage/programs/prepare-data/prepare-ec-data
second-stage/programs/prepare-data/prepare-ec-data100
second-stage/programs/prepare-data/prepare-new-data
second-stage/programs/prepare-data/ptb
second-stage/programs/wlle/avper
second-stage/programs/wlle/cvlm
second-stage/programs/wlle/cvlm-lbfgs
second-stage/programs/wlle/gavper
second-stage/programs/wlle/oracle
20 changes: 14 additions & 6 deletions Makefile
@@ -31,7 +31,7 @@
#
# The following high-level goals may also be useful:
#
# make nbestrain-clean # removes temporary files used in nbesttrain
# make nbesttrain-clean # removes temporary files used in nbesttrain
# make nbest-oracle # oracle evaluation of n-best results
# make features # extracts features from 20-fold parses
# make train-reranker # trains reranker model
@@ -68,12 +68,12 @@
# Version 4.1 and later gcc permit -march=native, but older
# versions will need -march=pentium4 or -march=opteron
#
# GCCFLAGS = -march=native -mfpmath=sse -msse2 -mmmx -m32
# GCCFLAGS ?= -march=native -mfpmath=sse -msse2 -mmmx -m32

# CFLAGS is used for all C and C++ compilation
#
CFLAGS = -MMD -O3 -Wall -ffast-math -finline-functions -fomit-frame-pointer -fstrict-aliasing $(GCCFLAGS)
LDFLAGS = $(GCCLDFLAGS)

EXEC = time

# for SWIG wrappers, use these flags instead
@@ -88,11 +88,16 @@ EXEC = time
# LDFLAGS = -g -Wall $(GCCLDFLAGS)
# EXEC = valgrind

CXXFLAGS = $(CFLAGS) -Wno-deprecated
CXXFLAGS ?= $(CFLAGS) -Wno-deprecated
export CFLAGS
export CXXFLAGS
export LDFLAGS

CC ?= gcc
CXX ?= g++
export CC
export CXX

# Building the 20-fold training data with nbesttrain
# --------------------------------------------------

@@ -517,11 +522,14 @@ train-reranker: $(WEIGHTSFILEGZ)
# This goal estimates the reranker feature weights (i.e., trains the
# reranker).
#
# Don't use auto-renaming as in "gzip foo" because it fails if there is
# more than one hardlink on the file (I'm looking at you Time Machine!).
#
# $(WEIGHTSFILEGZ): $(ESTIMATOR)
$(WEIGHTSFILEGZ): $(ESTIMATOR) $(MODELDIR)/features.gz $(FEATDIR)/train.gz $(FEATDIR)/dev.gz $(FEATDIR)/test1.gz
$(ESTIMATORENV) $(ZCAT) $(FEATDIR)/train.gz | $(EXEC) $(ESTIMATOR) $(ESTIMATORFLAGS) -e $(FEATDIR)/dev.gz -f $(MODELDIR)/features.gz -o $(WEIGHTSFILE) -x $(FEATDIR)/test1.gz
rm -f $(WEIGHTSFILEGZ)
gzip $(WEIGHTSFILE)
gzip -c $(WEIGHTSFILE) >$(WEIGHTSFILEGZ)
rm -f $(WEIGHTSFILE)

########################################################################
# #
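The hunk above replaces in-place `gzip $(WEIGHTSFILE)` with `gzip -c` redirected to the target, followed by an explicit `rm -f`. A minimal sketch of why that helps, assuming a gzip build that refuses to compress in place when the file has more than one hard link; `compress_keep_links` is a hypothetical helper name used only for illustration and is not part of the PR:

```shell
#!/bin/sh
# Sketch only: compress_keep_links is a hypothetical helper, not code from the PR.
# It mirrors the recipe's sequence: write the archive via stdout, then remove the source.
compress_keep_links() {
    src=$1
    dst=$2
    rm -f "$dst"
    # gzip -c only reads $src and writes to stdout, so the link count of $src is irrelevant.
    gzip -c "$src" > "$dst" && rm -f "$src"
}

# Demonstration in a scratch directory.
demo_dir=$(mktemp -d)
printf 'weights\n' > "$demo_dir/weights"
ln "$demo_dir/weights" "$demo_dir/weights.hardlink"   # simulate an extra hard link

compress_keep_links "$demo_dir/weights" "$demo_dir/weights.gz"
gzip -dc "$demo_dir/weights.gz"    # shows the original content
rm -rf "$demo_dir"
```

Removing the source only after `gzip -c` succeeds keeps the uncompressed file around if compression fails.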
68 changes: 68 additions & 0 deletions Makefile.mac
@@ -0,0 +1,68 @@
# To use these defaults set the MAKEFILES environment variable when calling make.
# export MAKEFILES=`pwd`/Makefile.mac

uname_S := $(shell sh -c 'uname -s 2>/dev/null || echo not')

# For Mavericks (and Mountain Lion) I set up gcc using macports:
# sudo port install gcc47
# sudo port select --set gcc mp-gcc47
# sudo port install boost liblbfgs

# Using MacPorts means that we have to override the default include and library locations.
LD_INCLUDE_PATH=/opt/local/include
LD_LIBRARY_PATH=/opt/local/lib

export LD_INCLUDE_PATH
export LD_LIBRARY_PATH

# The SParseval makefile uses a -lm dependency (a bad idea imho) which fails because there
# is no libm.a to be used. This trick works by mapping that to the system's libm.dylib.
# .LIBPATTERNS+=lib%.dylib
# export .LIBPATTERNS

# On Mac OS X using -march=native doesn't seem to work (a compilation error will occur).
# Turns out there is a problem with AVX instructions on OSX for gcc after 4.2.
# http://stackoverflow.com/questions/12016281/g-no-such-instruction-with-avx
# http://mac-os-forge.2317878.n4.nabble.com/gcc-as-AVX-binutils-and-MacOS-X-10-7-td144472.html
# So here's what works for me (with or without the -mfpmath=sse - the default is 387):

GCCFLAGS = -m64 -march=x86-64 -mfpmath=sse -msse -msse2 -msse3 -msse4 -msse4.1 -msse4.2 -mssse3 -I${LD_INCLUDE_PATH}

# Must use export because otherwise second-stage/programs/wlle/Makefile doesn't get the message.
export GCCFLAGS

# CC = condor_compile gcc
CC = gcc
export CC

# CXX = condor_compile g++
CXX = g++
export CXX

# fast options
# Compilation help: you may need to remove -march=native on older compilers.
# GCCFLAGS=-march=native -mfpmath=sse -msse2 -mmmx
FOPENMP=-fopenmp
# CFLAGS=-MMD -O3 -ffast-math -fstrict-aliasing -Wall -finline-functions $(GCCFLAGS) $(FOPENMP)
# LDFLAGS=$(FOPENMP) -L/opt/local/lib

# debugging options
# GCCFLAGS=
# FOPENMP=
# CFLAGS=-MMD -O0 -g $(GCCFLAGS) $(FOPENMP)
# LDFLAGS=-g $(FOPENMP)
# CXXFLAGS=${CFLAGS} -Wno-deprecated

# CFLAGS is used for all C and C++ compilation
#
CFLAGS = -MMD -O3 -Wall -ffast-math -finline-functions -fomit-frame-pointer -fstrict-aliasing $(GCCFLAGS)
export CFLAGS

CXXFLAGS=${CFLAGS} -Wno-deprecated
export CXXFLAGS

LDFLAGS = -L${LD_LIBRARY_PATH} $(GCCLDFLAGS)
export LDFLAGS

# This is a handy place to put a local setting without changing Makefile.
# PENNWSJTREEBANK = /usr/local/data/Penn3/parsed/mrg/wsj
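Makefile.mac is picked up through the `MAKEFILES` environment variable, as its first comment says. One way to set that variable only on macOS is sketched below; `makefiles_for` is a hypothetical helper and the `Darwin` check is an assumption about what `uname -s` reports there, not something the PR itself does:

```shell
#!/bin/sh
# Sketch only: makefiles_for is a hypothetical helper, not part of the PR.
# It prints the MAKEFILES value to use for a given kernel name and checkout path.
makefiles_for() {
    kernel=$1   # e.g. the output of `uname -s`
    repo=$2     # path to the bllip-parser checkout
    case "$kernel" in
        Darwin) printf '%s\n' "$repo/Makefile.mac" ;;
        *)      printf '\n' ;;   # other systems: no extra makefile
    esac
}

MAKEFILES=$(makefiles_for "$(uname -s 2>/dev/null || echo not)" "$(pwd)")
export MAKEFILES
# make    # run from the top of the checkout to build with these defaults
```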
3 changes: 2 additions & 1 deletion parse.sh
@@ -14,5 +14,6 @@
# RERANKDATA=ec50-connll-ic-s5
# RERANKDATA=ec50-f050902-lics5
MODELDIR=second-stage/models/ec50spfinal
ESTIMATORNICKNAME=cvlm-l1c10P1
# ESTIMATORNICKNAME=cvlm-l1c10P1
ESTIMATORNICKNAME=lbfgs-l1c10F1n1p2
first-stage/PARSE/parseIt -l399 -N50 first-stage/DATA/EN/ $* | second-stage/programs/features/best-parses -l $MODELDIR/features.gz $MODELDIR/$ESTIMATORNICKNAME-weights.gz
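The last line of parse.sh builds the weights file name from `MODELDIR` and `ESTIMATORNICKNAME`, so changing the nickname changes which file `best-parses` loads. A small sketch of that composition with an existence check; `weights_path` is a hypothetical helper and the check itself is an addition for illustration, not part of parse.sh:

```shell
#!/bin/sh
# Sketch only: weights_path is a hypothetical helper, not part of parse.sh.
# It composes the path the same way parse.sh does: $MODELDIR/$ESTIMATORNICKNAME-weights.gz
weights_path() {
    printf '%s/%s-weights.gz\n' "$1" "$2"
}

MODELDIR=second-stage/models/ec50spfinal
ESTIMATORNICKNAME=lbfgs-l1c10F1n1p2
weights=$(weights_path "$MODELDIR" "$ESTIMATORNICKNAME")

# Report up front whether the model file for this nickname is present.
if [ -f "$weights" ]; then
    echo "using $weights"
else
    echo "missing weights file: $weights" >&2
fi
```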
5 changes: 4 additions & 1 deletion python/bllipparser/ParsingShell.py
@@ -15,10 +15,13 @@
import nltk.tree
try:
import nltk.draw.tree
have_tree_drawing = False
read_nltk_tree = nltk.tree.Tree.fromstring
have_tree_drawing = True
read_nltk_tree = nltk.tree.Tree.parse
except ImportError:
have_tree_drawing = False
except AttributeError:
have_tree_drawing = False

from bllipparser.RerankingParser import RerankingParser

Binary file modified second-stage/models/ec50spfinal/features.gz
Binary file not shown.
48 changes: 33 additions & 15 deletions second-stage/programs/eval-beam/utility.h
@@ -891,24 +891,42 @@ inline std::ostream& operator<< (std::ostream& os, const boost::shared_ptr<T>& s

struct resource_usage { };

#ifndef __i386
#define NO_PROC_SELF_STAT
#endif

#ifdef __APPLE__
#define NO_PROC_SELF_STAT
#endif

#ifdef NO_PROC_SELF_STAT
inline std::ostream& operator<< (std::ostream& os, resource_usage r)
{
return os;
}
#else // Assume we are on a 586 linux
inline std::ostream& operator<< (std::ostream& os, resource_usage r)
{
FILE* fp = fopen("/proc/self/stat", "r");
assert(fp);
int utime;
int stime;
unsigned int vsize;
unsigned int rss;
int result =
fscanf(fp, "%*d %*s %*c %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u %d %d %*d %*d %*d %*d"
"%*u %*u %*d %u %u", &utime, &stime, &vsize, &rss);
assert(result == 4);
fclose(fp);
// s << "utime = " << utime << ", stime = " << stime << ", vsize = " << vsize << ", rss = " << rss
;
// return s << "utime = " << utime << ", vsize = " << vsize;
return os << "utime " << float(utime)/1.0e2 << "s, vsize "
<< float(vsize)/1048576.0 << " Mb.";
// Don't fail if we can't read that (such as on a Mac), just return.
if (fp == NULL) {
return os;
} else {
int utime;
int stime;
unsigned int vsize;
unsigned int rss;
int result =
fscanf(fp, "%*d %*s %*c %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u %d %d %*d %*d %*d %*d"
"%*u %*u %*d %u %u", &utime, &stime, &vsize, &rss);
assert(result == 4);
fclose(fp);
// s << "utime = " << utime << ", stime = " << stime << ", vsize = " << vsize << ", rss = " << rss;
// return s << "utime = " << utime << ", vsize = " << vsize;
return os << "utime " << float(utime)/1.0e2 << "s, vsize "
<< float(vsize)/1048576.0 << " Mb.";
}
}
#endif

#endif // UTILITY_H
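The `fscanf` format in this hunk skips to field 14 (`utime`) and field 23 (`vsize`) of `/proc/self/stat`, and the change makes a missing file a silent no-op instead of an assertion failure. The same behaviour can be sketched in shell for quick checks outside the C++ build; `report_usage` is a hypothetical name, and the sketch assumes 100 clock ticks per second (as the C++ code does) and a command name without spaces:

```shell
#!/bin/sh
# Sketch only: report_usage is a hypothetical shell analogue of the C++ operator<<.
# It prints nothing when the stat file is unreadable (for example on macOS).
report_usage() {
    stat_file=${1:-/proc/self/stat}
    [ -r "$stat_file" ] || return 0
    # Field 14 is utime in clock ticks (100 per second assumed);
    # field 23 is vsize in bytes. Assumes field 2 (the command name) has no spaces.
    awk '{ printf "utime %gs, vsize %g Mb.\n", $14 / 100, $23 / 1048576 }' "$stat_file"
}

# With no argument this reads the stat file of the awk process itself.
report_usage
```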
2 changes: 1 addition & 1 deletion second-stage/programs/eval-weights/Makefile
@@ -14,7 +14,7 @@ SOURCES = best-indices.cc best-parse.cc best-parses.cc compare-models.cc data.c
TARGETS = eval-weights # best-indices best-parse best-parses compare-models pretty-print
OBJECTS = $(patsubst %.l,%.o,$(patsubst %.c,%.o,$(SOURCES:%.cc=%.o)))

CC = gcc
CC ?= gcc

all: $(TARGETS)

41 changes: 26 additions & 15 deletions second-stage/programs/eval-weights/utility.h
@@ -882,6 +882,14 @@ inline std::ostream& operator<< (std::ostream& os, const boost::shared_ptr<T>& s
struct resource_usage { };

#ifndef __i386
#define NO_PROC_SELF_STAT
#endif

#ifdef __APPLE__
#define NO_PROC_SELF_STAT
#endif

#ifdef NO_PROC_SELF_STAT
inline std::ostream& operator<< (std::ostream& os, resource_usage r)
{
return os;
@@ -890,21 +898,24 @@ inline std::ostream& operator<< (std::ostream& os, resource_usage r)
inline std::ostream& operator<< (std::ostream& os, resource_usage r)
{
FILE* fp = fopen("/proc/self/stat", "r");
assert(fp);
int utime;
int stime;
unsigned int vsize;
unsigned int rss;
int result =
fscanf(fp, "%*d %*s %*c %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u %d %d %*d %*d %*d %*d"
"%*u %*u %*d %u %u", &utime, &stime, &vsize, &rss);
assert(result == 4);
fclose(fp);
// s << "utime = " << utime << ", stime = " << stime << ", vsize = " << vsize << ", rss = " << rss
;
// return s << "utime = " << utime << ", vsize = " << vsize;
return os << "utime " << float(utime)/1.0e2 << "s, vsize "
<< float(vsize)/1048576.0 << " Mb.";
// Don't fail if we can't read that (such as on a Mac), just return.
if (fp == NULL) {
return os;
} else {
int utime;
int stime;
unsigned int vsize;
unsigned int rss;
int result =
fscanf(fp, "%*d %*s %*c %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u %d %d %*d %*d %*d %*d"
"%*u %*u %*d %u %u", &utime, &stime, &vsize, &rss);
assert(result == 4);
fclose(fp);
// s << "utime = " << utime << ", stime = " << stime << ", vsize = " << vsize << ", rss = " << rss;
// return s << "utime = " << utime << ", vsize = " << vsize;
return os << "utime " << float(utime)/1.0e2 << "s, vsize "
<< float(vsize)/1048576.0 << " Mb.";
}
}
#endif

48 changes: 33 additions & 15 deletions second-stage/programs/features/utility.h
@@ -891,24 +891,42 @@ inline std::ostream& operator<< (std::ostream& os, const boost::shared_ptr<T>& s

struct resource_usage { };

#ifndef __i386
#define NO_PROC_SELF_STAT
#endif

#ifdef __APPLE__
#define NO_PROC_SELF_STAT
#endif

#ifdef NO_PROC_SELF_STAT
inline std::ostream& operator<< (std::ostream& os, resource_usage r)
{
return os;
}
#else // Assume we are on a 586 linux
inline std::ostream& operator<< (std::ostream& os, resource_usage r)
{
FILE* fp = fopen("/proc/self/stat", "r");
assert(fp);
int utime;
int stime;
unsigned int vsize;
unsigned int rss;
int result =
fscanf(fp, "%*d %*s %*c %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u %d %d %*d %*d %*d %*d"
"%*u %*u %*d %u %u", &utime, &stime, &vsize, &rss);
assert(result == 4);
fclose(fp);
// s << "utime = " << utime << ", stime = " << stime << ", vsize = " << vsize << ", rss = " << rss
;
// return s << "utime = " << utime << ", vsize = " << vsize;
return os << "utime " << float(utime)/1.0e2 << "s, vsize "
<< float(vsize)/1048576.0 << " Mb.";
// Don't fail if we can't read that (such as on a Mac), just return.
if (fp == NULL) {
return os;
} else {
int utime;
int stime;
unsigned int vsize;
unsigned int rss;
int result =
fscanf(fp, "%*d %*s %*c %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u %d %d %*d %*d %*d %*d"
"%*u %*u %*d %u %u", &utime, &stime, &vsize, &rss);
assert(result == 4);
fclose(fp);
// s << "utime = " << utime << ", stime = " << stime << ", vsize = " << vsize << ", rss = " << rss;
// return s << "utime = " << utime << ", vsize = " << vsize;
return os << "utime " << float(utime)/1.0e2 << "s, vsize "
<< float(vsize)/1048576.0 << " Mb.";
}
}
#endif

#endif // UTILITY_H